jandrew13

[–] jandrew13@lemmy.world 1 points 2 hours ago (2 children)

What does this contain? Anything new?

[–] jandrew13@lemmy.world 1 points 3 hours ago

I've been thinking a lot about this whole thing. I don't want to be worried or fearful here - we have done nothing wrong! Anything we have archived was provided to us directly by them in the first place. There are whispers all over the internet, random torrents being passed around, conspiracies, etc., but what are we actually doing other than freaking ourselves out (myself at least) and going viral with an endless stream of "OMG LOOK AT THIS FILE" videos/posts?

I vote to remove any of the 'concerning' files and backfill with blank placeholder PDFs (with justification noted), then collect everything we have so far, create file hashes, and put out a clean, stable, indexed archive. That wipes away any concerns and lets us proceed methodically through the blood trail of documents, resulting in an obvious and accessible collection of evidence. From there we can actually start organizing to build a tool for crowd-sourced tagging, timestamping, and parsing of the data. I'm a developer and am happy to offer my skillset.
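For the hashing step, here's a minimal sketch of the kind of manifest generator I mean (the manifest filename and layout are placeholders, not a settled format):

```python
# Minimal sketch of a SHA-256 manifest generator for the archive.
# The SHA256SUMS.txt name/format is a placeholder, not an agreed standard.
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def write_manifest(root: str, out: str = "SHA256SUMS.txt") -> None:
    root_path = Path(root)
    with open(out, "w") as manifest:
        for p in sorted(root_path.rglob("*")):
            if p.is_file():
                manifest.write(f"{sha256_of(p)}  {p.relative_to(root_path)}\n")
```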

Taking a step back - it's fun to do the "digital sleuth" thing for a while, but then what? We have the files... (mostly). Great. We all have our own lives, jobs, and families, and taking actual time to dig into this and produce a real solution that can actually make a difference is a pretty big ask. That said, this feels like a moment where we can finally make an actual difference, and I think it's worth committing to. If any of you are interested in helping beyond archival, please lmk.

I just downloaded Matrix, but I'm new to this, so I'm not sure how it all works. Happy to link up via Discord, Matrix, email, or whatever.

[–] jandrew13@lemmy.world 1 points 18 hours ago* (last edited 18 hours ago) (1 children)

This dude on Pastebin posted the file tree of his Epstein Ubuntu env - I have high confidence in whatever lives in his DataSet9Complete.zip file, haha

[–] jandrew13@lemmy.world 2 points 21 hours ago

I have a scraper running against web.archive.org, pulling all previously posted Court Records and FOIA materials (docs, audio, etc.) from the Jan 30th release.
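If anyone wants to run their own pull, here's a minimal sketch against the Wayback Machine's CDX API (the URL prefix below is a placeholder, not the actual DOJ path):

```python
# Minimal sketch of listing Wayback snapshots under a URL prefix via the
# CDX API. "example.gov/records/" is a placeholder, not the real path.
import requests

CDX = "http://web.archive.org/cdx/search/cdx"

def list_snapshots(url_prefix: str = "example.gov/records/"):
    params = {
        "url": url_prefix + "*",   # everything under the prefix
        "output": "json",
        "fl": "timestamp,original,mimetype",
        "filter": "statuscode:200",
        "collapse": "urlkey",      # one row per unique URL
    }
    rows = requests.get(CDX, params=params, timeout=60).json()
    return rows[1:]  # first row is the field header

def snapshot_url(timestamp: str, original: str) -> str:
    return f"https://web.archive.org/web/{timestamp}/{original}"
```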

[–] jandrew13@lemmy.world 5 points 23 hours ago (2 children)

Holy shit

The entire Court Records and FOIA page is completely gone too! Fuckers!

[–] jandrew13@lemmy.world 1 points 23 hours ago

Does anyone have the OTHER data sets from before? I've been laser-focused on DS1-DS12 but haven't looked at the other documents at all.

[–] jandrew13@lemmy.world 1 points 23 hours ago (2 children)

This is ridiculous. Good thing we got in when we did!

[–] jandrew13@lemmy.world 4 points 1 day ago (3 children)

While I feel hopeful that we will be able to reconstruct the archive and create some sort of baseline that can be put back out there, I also can't stop thinking about the "and then what" aspect here. We've seen our elected officials do nothing with this info over and over again, and I'm worried this is going to repeat itself.

I'm fully open to input on this, but I think having an agreed path forward for the group is useful here. These are the things I believe we can do to move the needle.

Right Now:

  1. Create a clean Data Archive for each of the known datasets (01-12). Something that is actually organized and accessible.
  2. Create a working Archive Directory: an itemized reference list (SQL DB?) covering the full Data Archive, with each document listed as a row of metadata. Imagining a GitHub repo that we can all contribute to as we work. (Rough schema sketch after this list.)
     • File number
     • Directory location
     • File type (image, legal record, flight log, email, video, etc.)
     • File status (Redacted bool, Missing bool, Flagged bool)
  3. Infill any MISSING records where possible.
  4. Extract images out of the .pdf format, break out the "multi-file" PDFs, and rename images/docs by file number. (I made a quick script that does this reliably well; the rough shape is sketched after this list.)
  5. Determine which files were left as CSAM and "redact" them ourselves, removing any liability on our part.
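For item 2, a rough sketch of what the Directory could look like as a SQLite schema (table and column names are just my first guess, nothing settled):

```python
# Rough SQLite schema sketch for the Archive Directory; table and column
# names are placeholders, not an agreed standard.
import sqlite3

def init_directory(db_path: str = "archive_directory.db") -> None:
    con = sqlite3.connect(db_path)
    con.executescript("""
        CREATE TABLE IF NOT EXISTS documents (
            file_number  TEXT PRIMARY KEY,   -- e.g. an EFTA number
            dir_location TEXT NOT NULL,      -- path inside the Data Archive
            file_type    TEXT,               -- image, legal record, flight log, ...
            sha256       TEXT,               -- ties back to the hash manifest
            redacted     INTEGER DEFAULT 0,  -- the bool flags from above
            missing      INTEGER DEFAULT 0,
            flagged      INTEGER DEFAULT 0
        );
    """)
    con.commit()
    con.close()
```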
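And for item 4, the rough shape of the extraction idea (this sketch assumes PyMuPDF; my actual script differs in the details, and the output naming is illustrative):

```python
# Sketch: pull embedded images out of a multi-image PDF and name them by
# file number. Assumes PyMuPDF; "<file_number>_<n>.png" naming is my guess.
import fitz  # PyMuPDF
from pathlib import Path

def extract_images(pdf_path: str, file_number: str, out_dir: str) -> None:
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    doc = fitz.open(pdf_path)
    count = 0
    for page in doc:
        for img in page.get_images(full=True):
            pix = fitz.Pixmap(doc, img[0])  # img[0] is the image xref
            if pix.n - pix.alpha >= 4:      # CMYK etc. -> convert to RGB
                pix = fitz.Pixmap(fitz.csRGB, pix)
            count += 1
            pix.save(str(out / f"{file_number}_{count:03d}.png"))
    doc.close()
```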

What's Next: Once we have the Archive and Archive Directory, we can begin safely and confidently walking through the Directory as a group effort and filling in as many files/blanks as possible.

  1. Identify and de-redact all documents with garbage redactions (remember the copy/paste DOJ blunders from December), and identify poorly positioned redaction bars that fail to cover the names underneath.
  2. LABELING! If we could start adding labels to each document in the form of tags for individuals, emails, locations, and businesses, it would make it MUCH easier for people to "connect the dots". (Rough tag-table sketch after this list.)
  3. Event Timeline... This will be hard, but if we can apply a timeline ID to each document, we can put the archive in order of events
  4. Create some method for visualizing the timeline, searching, and making connections with labels.
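For the labeling idea, a tiny extension of the schema sketch above (names are placeholders again):

```python
# Tag tables extending the earlier Directory sketch; all names are
# placeholders for discussion, not a settled design.
import sqlite3

def init_tags(db_path: str = "archive_directory.db") -> None:
    con = sqlite3.connect(db_path)
    con.executescript("""
        CREATE TABLE IF NOT EXISTS tags (
            id    INTEGER PRIMARY KEY,
            kind  TEXT NOT NULL,   -- individual, email, location, business
            value TEXT NOT NULL,
            UNIQUE (kind, value)
        );
        CREATE TABLE IF NOT EXISTS document_tags (
            file_number TEXT REFERENCES documents(file_number),
            tag_id      INTEGER REFERENCES tags(id),
            PRIMARY KEY (file_number, tag_id)
        );
    """)
    con.commit()
    con.close()
```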

We may not be detectives, legislators, or lawmen, but we are sleuth nerds, and the best thing we can do is get this data into a shape that allows others to push for justice and put an end to this crap once and for all. It's lofty, I know, but enough is enough. ...Thoughts?

[–] jandrew13@lemmy.world 2 points 1 day ago (1 children)

I'm not sure of the exact files that were reported by the NYT, but there were certainly some concerning images in the initial Jan 30 release - and certainly more than the reported 40. I saw others as well, but I don't remember what the file numbers were.

::: spoiler file numbers
[246249_247010]
:::

From my own observation timeline on the images in question:

  • **Jan 30:** Images were accessible through the DOJ directly. Their file numbers were skipped in the list, but they were manually reachable through the URL. All of these photos were fully unredacted (uncensored).
  • **Feb 1:** Images were NOT accessible through the DOJ anymore; the pages return "Page not found". However, the images were (and still are) snapshotted via web.archive.org.
  • **Feb 2:** The 87GB Set 9 download appeared to contain these images as well, meaning we likely all have them on our computers. Yikes.

These files were scrubbed from the DOJ website, along with many others.

I found many of the scrubbed files by parsing through the lists and finding large gaps in file numbers where the preceding file did not contain multiple images/documents in one PDF. (A sketch of the gap-finder is after the list below.) There are also tons of internal memos in the dataset that precede file groups and talk about the content ahead. These memos surrounded files that seemed like they were meant to be redacted, so it's worth poking around. I didn't go nuts, but here are a few things I found around these memos that were interesting and have also been removed:

  • [EFTA00276493]: internal memo referring to Clinton photographed with "nude Gretchen".
  • [EFTA00273790-EFTA276487]: (removed) looks like aerial LiDAR scans of the full estate?
  • [EFTA00276220]: (removed) panoramic infrared / X-ray scan of a room
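For the gap-finding approach above, a minimal sketch (it assumes the numbers look like EFTA00276493; adjust the parsing to the actual lists):

```python
# Minimal sketch of finding gaps in a list of file numbers like
# "EFTA00276493". The min_gap threshold is an arbitrary starting point.
import re

def find_gaps(file_numbers, min_gap=2):
    nums = sorted(int(re.sub(r"\D", "", fn)) for fn in file_numbers)
    gaps = []
    for a, b in zip(nums, nums[1:]):
        if b - a >= min_gap:
            gaps.append((a, b, b - a - 1))  # (before, after, count missing)
    return gaps
```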
[–] jandrew13@lemmy.world 2 points 1 day ago (1 children)

Hey that makes sense to me man.

I think there will be plenty of falling chips in the coming weeks. Once the data is aggregated and truly accessible and searchable... someone is going to build some AI thing that can connect the dots faster than the justice system - because my god is it slow as molasses.

I'm so tired of waiting around.

[–] jandrew13@lemmy.world 3 points 1 day ago

This seems like a valid plan - although I'm not that confident in the 'purge'. It might be good to redact those images ourselves so that nobody is on the hook for storing them. Better to have a confidently safe dataset that can be passed around.

Also, it looks like they went back and repaired the shitty text redactions on the docs that were released in late 2025, from what I can tell. I ran a script that auto-detects and removes "fake" redactions, and it's not getting any hits anymore - even on files that it flagged in the past. They are definitely covering their tracks more thoroughly by the day.
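For anyone curious, this is roughly the shape of such a detector (a minimal sketch assuming PyMuPDF, and detection only - my actual script does more):

```python
# Rough sketch of a fake-redaction detector: flags places where extractable
# text still sits underneath a solid black rectangle. Assumes PyMuPDF.
import fitz  # PyMuPDF

def flag_fake_redactions(pdf_path: str):
    flagged = []
    doc = fitz.open(pdf_path)
    for page in doc:
        for drawing in page.get_drawings():
            if drawing.get("fill") != (0.0, 0.0, 0.0):
                continue  # only solid black boxes are interesting here
            hidden = page.get_text("text", clip=drawing["rect"]).strip()
            if hidden:  # a text layer survives under the box
                flagged.append((page.number, tuple(drawing["rect"])))
    doc.close()
    return flagged
```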
