this post was submitted on 08 Jan 2025

cross-posted from: https://slrpnk.net/post/17044297

You don't understand, I might need that hilarious Cracked listicle from fifteen years ago!

clb92@feddit.dk (4 weeks ago, last edited 3 weeks ago)

I don't think ArchiveBox indexes the text content itself, but you could certainly set up an external application that indexes the archived pages and lets you search them. A quick search turned up a GitHub issue where someone describes setting up Sonic for that purpose: https://github.com/ArchiveBox/ArchiveBox/issues/956#issuecomment-1320587158

EDIT: It seems Sonic is a standalone search backend that ArchiveBox supports directly for full-text search. I'm gonna try it out too.

EDIT: Works great
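
For anyone who would rather not run Sonic, here's a rough sketch of the "external application" idea: a tiny inverted index over saved HTML pages, using only the Python standard library. To be clear, this is not how ArchiveBox or Sonic work internally. The names `html_to_text`, `tokenize`, `build_index` and `search` are made up for illustration, and the pages are passed in as a plain dict of id to HTML string.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Strip tags from an HTML string and return its text content."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """pages: dict of page id -> HTML string. Returns word -> set of page ids."""
    index = defaultdict(set)
    for page_id, html in pages.items():
        for word in tokenize(html_to_text(html)):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return ids of pages that contain every word in the query (AND search)."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

You'd fill the `pages` dict by reading the saved HTML files from wherever your archive keeps them. There's no ranking, stemming or persistence here, which is exactly the stuff a real search backend gives you.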