this post was submitted on 15 Feb 2026
44 points (100.0% liked)

Technik

the community for everything you could describe as technology


Posts in German or English

founded 2 years ago
top 3 comments
[–] cerebralhawks@lemmy.dbzer0.com 10 points 4 days ago

Or could it have something to do with the fact that social media is using Archive links to get information to people through paywalls?

[–] rogsson@piefed.social 9 points 4 days ago

lool too late

“Common Crawl and Internet Archive are widely considered to be the ‘good guys’ and are used by ‘the bad guys’ like OpenAI,” said Michael Nelson, a computer scientist and professor at Old Dominion University. “In everyone’s aversion to not be controlled by LLMs, I think the good guys are collateral damage.”

Hope for a revert once the bubble pops?

“We believe in the value of The New York Times’s human-led journalism and always want to ensure that our IP is being accessed and used lawfully,” said a Times spokesperson. “We are blocking the Internet Archive’s bot from accessing the Times because the Wayback Machine provides unfettered access to Times content — including by AI companies — without authorization.”

Fuck them. Archive on archive.today instead. I hope these bastards get a UTI.

Some Gannett sites have also taken stronger measures to guard their contents from Internet Archive crawlers. URL searches for the Des Moines Register in the Wayback Machine return a message that says, “Sorry. This URL has been excluded from the Wayback Machine.”

“USA Today Co. has consistently emphasized the importance of safeguarding our content and intellectual property,” a company spokesperson said via email. “Last year, we introduced new protocols to deter unauthorized data collection and scraping, redirecting such activity to a designated page outlining our licensing requirements.”

Luigi, your job.