submitted 1 year ago by usernotfound@lemmy.ml to c/lemmy@lemmy.ml

(attempt to cross-post from /c/programming)

Idea: Scrape all the posts from a subreddit as they're being made and "archive" them on a Lemmy instance, making it very clear that the content is being rehosted and linking back to the original. It would probably have to be a "closed" Lemmy instance specifically for this purpose. The tool would run for multiple subreddits, so Lemmy users could still follow and discuss content that would otherwise be left behind.

Thoughts? It's probably iffy copyright-wise, but I think I can square my conscience with it.

Sam_uk@kbin.social 6 points 1 year ago

You could just grab the RSS? http://reddit.com/r/worldnews.rss I don't know if there is any RSS importer.
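For what it's worth, here is a rough sketch of what grabbing the feed could look like in Python using only the standard library. It assumes the `.rss` URL returns an Atom document (which is what Reddit appeared to serve at the time; that may differ or change), and the names `fetch_feed`, `parse_entries`, `format_rehost`, and the User-Agent string are made up for illustration. Posting the result to a Lemmy instance is left out entirely.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumption: the feed is Atom, so elements live in this namespace.
NS = {"a": "http://www.w3.org/2005/Atom"}


def fetch_feed(url, user_agent="subreddit-archiver-sketch/0.1"):
    """Download the raw feed bytes. The User-Agent value is a placeholder."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def parse_entries(xml_data):
    """Turn an Atom feed (str or bytes) into a list of simple post dicts."""
    root = ET.fromstring(xml_data)
    posts = []
    for entry in root.findall("a:entry", NS):
        link = entry.find("a:link", NS)
        posts.append({
            "title": entry.findtext("a:title", default="", namespaces=NS),
            "url": link.get("href", "") if link is not None else "",
            "author": entry.findtext("a:author/a:name", default="", namespaces=NS),
        })
    return posts


def format_rehost(post):
    """Build a post body that clearly marks the content as rehosted."""
    return (
        f"{post['title']}\n\n"
        f"Rehosted from Reddit. Originally posted by {post['author']}.\n"
        f"Original: {post['url']}"
    )
```

Usage would be something like `parse_entries(fetch_feed("http://reddit.com/r/worldnews.rss"))`, run on a timer, with some deduplication on the post URL so the same entry is not archived twice.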

usernotfound@lemmy.ml 4 points 1 year ago* (last edited 1 year ago)

Ooh, that is a very good point! I actually started something using BeautifulSoup in Python, but that would save some hassle.

this post was submitted on 08 Jun 2023
38 points (97.5% liked)
