this post was submitted on 05 Feb 2026
521 points (98.9% liked)

A new tool searches your LinkedIn connections for people who are mentioned in the Epstein files, in case you, understandably, don't want anything to do with them on the already deranged social network.

404 Media tested the tool, called EpsteIn (a mashup of Epstein and LinkedIn), and it appears to work.

“I found myself wondering whether anyone had mapped Epstein's network in the style of LinkedIn—how many people are 1st/2nd/3rd degree connections of Jeffrey Epstein?” Christopher Finke, the creator of the tool, told 404 Media in an email. “Smarter programmers than me have already built tools to visualize that, but I couldn't find anything that would show the overlap between my network and his.”

[–] Sterile_Technique@lemmy.world 4 points 11 hours ago (2 children)

Is there a tool that crunches the entirety of the documents and sorts the individual words by frequency? For example, doing it the stupid way (semi-manually), I copied OP's article into Word and replaced every space with a line break to turn the entire article into a one-word-per-line list, then plugged that into Excel, sorted it alphabetically, and manually counted and deleted the repeats. Then I sorted those to put the most frequent words on top.

This reduced the 525-word article to a list of 284 unique words. If I added another article, the list would only grow by the words in the second article that didn't appear in the first, so as more and more articles are added, each one contributes fewer and fewer new entries. Do this to thousands of pages of documents like the Epstein files and you could condense dozens of pages' worth of just the word "the" down to a single entry, making the whole set much easier to skim for highlights. Like, if the word 'velociraptor' were randomly hidden in the article, most readers would skim right past it; but in the list below it would stand out like a sore thumb, prompting a targeted search of the full document for context. It would be even better if we could flag words as not interesting and, like, click to knock "the," "of," "and," etc. off the list.

...maybe a project for someone who actually knows what they're doing... my skills hit a brick wall after things like 'find and replace' in Word, but you get the gist.

Word used: # found:
The 37
Of 16
And 14
To 14
Epstein 11
In 11
Tool 9
A 8
I 8
Files 7
But 5
For 5
Is 5
Linkedin 5
Many 5
On 5
That 5
With 5
404 4
Also 4
An 4
Connections 4
Found 4
Media 4
Not 4
People 4
All 3
Anything 3
Are 3
As 3
Him 3
It 3
My 3
Network 3
Them 3
Were 3
Who 3
Already 2
Appears 2
Case 2
Common 2
Con 2
Def 2
Documents 2
DOJ 2
Dump 2
Each 2
Excerpts 2
Find 2
Finke 2
Founder 2
From 2
How 2
Jeffrey 2
Me 2
Mentioned 2
Moss 2
Name 2
Names 2
Obviously 2
Other 2
Overlap 2
Page 2
Positives 2
Repository 2
Said 2
Search 2
Their 2
This 2
Up 2
Vincenzo 2
Work 2
Your 2
5 1
22 1
35 1
1st 1
2nd 1
3rd 1
Acknowledges 1
Across 1
Adam 1
Added 1
After 1
Although 1
Anyone 1
Api 1
Appearance 1
Approached 1
Attended 1
Audio 1
Away 1
Badges 1
Based 1
Be 1
Because 1
Behind 1
Between 1
Brin 1
Built 1
Called 1
Can 1
Chose 1
Christopher 1
Company 1
Conference 1
Contained 1
Contains 1
Context 1
Could 1
Couldn't 1
Court 1
Covered 1
Co-Worker 1
Creator 1
Days 1
Deep 1
Degree 1
Department 1
Deranged 1
Did 1
Didn’t 1
Do 1
Document 1
Does 1
Don’t 1
Down 1
Duggan 1
Easily 1
Elites 1
Email 1
Epstein's 1
Far 1
First 1
Free 1
Fully 1
Ghislaine 1
Girls 1
Github 1
Gut 1
Hacker 1
Hacking 1
Had 1
Have 1
He 1
His 1
Hits 1
Images 1
Incidental 1
Included 1
Inclusion 1
Initial 1
Introduce 1
Investigations 1
Involvement 1
Iozzo 1
Jeff 1
Just 1
Justice’s 1
Keep 1
Know 1
Known 1
Larry 1
Last 1
Likely 1
Links 1
Lot 1
Made 1
Make 1
Mapped 1
Mash 1
Massive 1
Matching 1
Material 1
Maxwell 1
May 1
Mean 1
Mention 1
Mentions 1
Million 1
Moss’s 1
Multiple 1
Musk’s 1
Myself 1
Necessarily 1
Nefarious 1
Never 1
New 1
No 1
Nude 1
Number 1
Off 1
Offered 1
Only 1
Or 1
Original 1
Others 1
Output 1
Pages 1
Paid 1
Patrick 1
Peter 1
Photos 1
Pointed 1
Position 1
Post 1
Previous 1
Produce 1
Programmers 1
Publicly 1
Published 1
Purposefully 1
Reads 1
Realize 1
Recordings 1
Reddit 1
Related 1
Released 1
Relevance 1
Report 1
Reported 1
Result’s 1
Review 1
S 1
Saw 1
Scenes 1
Searched 1
Searches 1
Sergey 1
Show 1
Shows 1
Smarter 1
Social 1
Some 1
Stay 1
Stuff 1
Style 1
Suppose 1
Surprising 1
Taking 1
Tech 1
Tested 1
Than 1
Thankfully 1
There 1
These 1
Thiel 1
Those 1
Told 1
Tools 1
Total 1
Touch 1
Tried 1
Trusting 1
Understandably 1
Unredacted 1
Upload 1
Verify 1
Very 1
Videos 1
Visualize 1
Want 1
Warn 1
Way 1
We 1
Wealth 1
Website 1
Week 1
Well 1
Went 1
Where 1
Whether 1
Wikipedia 1
Wild 1
Wired 1
Women 1
Wondering 1
Would 1
Wrote 1
You 1
Zero 1
[–] thanks_shakey_snake@lemmy.ca 2 points 6 hours ago

Seriously, if you're motivated enough to do this, you should give programming a try. Python, Ruby, or JavaScript are all ideal for this kind of thing, and you can solve a problem like this in a few lines of code... just look up "word frequency in Python" (or whatever language you pick) for examples.
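
For a taste, here's roughly what it looks like in Python, standard library only. Treat it as a sketch: the filename and the stop-word list are placeholders to adapt, not anything official.

```python
# Rough word-frequency sketch: read a file, split into words, count,
# and print the most common first, like the list above but automatic.
import re
from collections import Counter

# Placeholder stop words; extend this to knock "the", "of", "and" off the list.
STOP_WORDS = {"the", "of", "and", "to", "a", "in", "is", "that", "for", "with"}

with open("epstein_files.txt", encoding="utf-8") as f:  # placeholder filename
    words = re.findall(r"[a-z0-9']+", f.read().lower())

counts = Counter(w for w in words if w not in STOP_WORDS)

for word, n in counts.most_common(50):  # top 50; adjust to taste
    print(f"{word}\t{n}")
```

Run it over a whole folder of files instead of one and you get exactly the "each new article adds fewer new words" effect you described.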

If you want to see what the next level of this kind of analysis looks like, watch a few videos about how Elasticsearch works... not so much so you can USE Elasticsearch (although you can, it's free), but just to get a sense of how they approach problems like this. Like, imagine that instead of just counting word occurrences, you kept track of WHERE in the text each word was: you could still count occurrences, but you could also pull up the surrounding text and do a bunch of other interesting things.
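
To make that concrete, here's a toy version of the position-tracking idea. Real Elasticsearch does vastly more (analysis, scoring, sharding), so this is just the gist:

```python
# Toy positional index: map each word to the character offsets where it
# occurs, so you can still count hits AND jump back to the surrounding text.
import re
from collections import defaultdict

text = "the tool searched the files and the tool found matches"
index = defaultdict(list)

for m in re.finditer(r"\w+", text.lower()):
    index[m.group()].append(m.start())

print(len(index["the"]))       # plain occurrence count: 3
for pos in index["tool"]:      # context around each hit
    print(text[max(0, pos - 10):pos + 14])
```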

[–] mlg@lemmy.world 4 points 10 hours ago

There's probably a nice shell pipeline that does what you want lol. cat + awk unique count + sort

I'm just forgetting if there's an easy way to keep the line numbers or filename so you can easily go back to the full page reference.
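
Something like this should get close (rough sketch, filenames made up). The counting pipeline throws the line numbers away by design, but plain old grep -rn covers the jump-back-to-the-source part:

```sh
# Lowercase, split to one word per line, then count and rank by frequency.
tr '[:upper:]' '[:lower:]' < files.txt | tr -cs "[:alnum:]'" '\n' | sort | uniq -c | sort -rn

# To find where a word lives, grep keeps the filename and line number:
grep -rni 'velociraptor' epstein_files/
```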