I hate to say it, but there are a lot of "vibe coders" who use AI to write their code, and then they (or someone else) use AI to review it. No human brains involved.
The article says:
None of the tools produced exploitable SQL injection or cross-site scripting
but I've seen exactly this. After years of not seeing any SQL injection vulnerabilities (due to the large increase in ORM usage plus the fact that pretty much every query library supports/uses prepared statements now), I caught one while reviewing vibe-coded code ~~written~~ generated by someone else.
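For anyone who hasn't run into one in a while, this is the classic shape of the bug. A minimal sketch (the function names and placeholder style are illustrative, not from the code I reviewed): splicing user input into the SQL text lets the input rewrite the query, while a parameterized query sends the SQL and the values separately so the driver binds values as data.

```javascript
// Vulnerable: user input is concatenated directly into the SQL text.
function findUserUnsafe(username) {
  return `SELECT * FROM users WHERE name = '${username}'`;
}

// An attacker-controlled value changes the query's logic:
const payload = "' OR '1'='1";
console.log(findUserUnsafe(payload));
// SELECT * FROM users WHERE name = '' OR '1'='1'

// Safe: query text and values travel separately; the library binds the
// value as data, never as SQL. Placeholder syntax varies ($1, ?, :name, ...).
function findUserSafe(username) {
  return { text: "SELECT * FROM users WHERE name = $1", values: [username] };
}
```

This is also why ORMs and prepared statements made the bug rare: they make the safe path the default, and the vulnerability only reappears when someone (or something) hand-builds the query string.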
It wasn't a dox attempt though. The blog just collected information that was already publicly available on other sites.
In this case, their CAPTCHA page intentionally included code to DoS a particular blog, sending a request to search for a random string every 300ms (search is very CPU-intensive). It did this regardless of the archived site you were trying to view.
This is understandable, but at the same time, none of the anti-paywall lists are as good as archive.today. They actually have paid accounts at a bunch of paywalled sites, and use them when scraping.
Sketchy looking site (seems entirely AI generated) but interesting article regardless.
Why not use a provider like AirVPN that lets you use the same port number all the time?
I understand now. I completely missed the point.
It works well because they use paid accounts to scrape a bunch of paywalled sites, which is why publishers are trying to figure out who runs it.
It's completely untrustworthy now that they've shown that they can (and do) edit archived pages.
Why do you need an archive of Wikipedia though? Each page retains its entire history, so you can easily go back to old versions without using a third-party site (especially one that DDoSes people).
Wikimedia also provide downloads of the whole of Wikipedia, including page history. You can easily have your own copy of the entirety of Wikipedia if you want to, as long as you've got enough disk space and patience to download it.
Edit: I'm an idiot but I'm leaving this comment here. I didn't realise you meant dead links on Wikipedia, not to Wikipedia.
the client side can be as fast or faster than the ‘server’ side.
That's not the case on a lot of JS-heavy sites, though. A lot of logic runs on the main thread, which slows things down. The only way to run things off the main thread is by using web workers, but that adds extra serialization/deserialization overhead.
That also has the potential to create security concerns at both ends.
Generally, the more logic you have on the client-side, the more likely you are to have security issues. The client is a completely untrusted environment (since they can do whatever they want with your JS code), and increasing the amount of logic on the client side increases your attack surface there.
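The practical consequence: anything security-relevant has to be re-derived on the server, because the client can send whatever it likes. A minimal sketch with a hypothetical shop example (names and prices are made up for illustration):

```javascript
const PRICES = { widget: 500 }; // authoritative, server-side, in cents

// Wrong: trusts a total the client computed in JS the user controls.
function chargeUnsafe(order) {
  return order.clientTotal;
}

// Right: recompute from server-side data and ignore client claims.
function chargeSafe(order) {
  const price = PRICES[order.item];
  if (price === undefined) throw new Error("unknown item");
  return price * order.quantity;
}

// A tampered request (trivial to send from dev tools or curl):
const tampered = { item: "widget", quantity: 3, clientTotal: 1 };
console.log(chargeUnsafe(tampered)); // 1    — attacker pays 1 cent
console.log(chargeSafe(tampered));   // 1500 — server-derived total
```

Client-side checks are still worth having, but only as UX; the server-side check is the one that counts.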
Haha good point - maybe "generated by" is a better description?