this post was submitted on 31 Mar 2026
452 points (99.8% liked)
Technology
It doesn't fix it, but as stupid as it looks, it should actually improve the chances.
If you've seen how the reasoning models work, they basically spit out some garbage, then read it back and decide whether it's too much garbage or not.
They do try to 'correct their errors', so to speak.
It will slightly improve the chances. But, is that enough?
Imagine you had an intern working with you on a project. They didn't know anything about SQL injection, cross-site scripting, etc. You probably wouldn't give them a task where that was a concern. If you did, you'd watch them like a hawk. Because they're an intern, the amount of code they'd produce would probably be pretty low, and it would be pretty low-quality overall, so it would be easy to spot the mistakes that lead to these kinds of vulnerabilities.
An LLM has the understanding of the problem space that an intern does, but produces vast amounts of code extremely quickly. That code is designed to "blend in", i.e. it's specifically trained to look like good code, whether it is or not. Because of "vibe coding", people trust it to do all kinds of things, including implement bits where there's a danger of XSS or SQL injection. And the way Claude Code ensures it doesn't generate those vulnerabilities is... someone says "hey, don't do that, ok?"
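To make concrete what SQL injection actually looks like, here's a minimal sketch using Python's built-in sqlite3 module (the table and the malicious input are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

# Attacker-controlled input containing a quote and an OR clause.
user_input = "alice' OR '1'='1"

# Vulnerable: the input is concatenated straight into the SQL string,
# so the injected OR clause becomes part of the query and matches
# every row in the table.
unsafe = conn.execute(
    "SELECT * FROM users WHERE name = '" + user_input + "'"
).fetchall()

# Safe: a parameterized query treats the input as data, never as SQL.
# No user is literally named "alice' OR '1'='1", so nothing matches.
safe = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe), len(safe))
```

The concatenated version returns the whole table; the parameterized version returns nothing, because the quoting trick never reaches the SQL parser.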
Having that statement in there is better than not having it. But, it's just a reminder that these things aren't appropriate for writing production code. They don't actually understand what XSS or SQL injection are, and they can't learn. They don't know why it's important. They don't have a technique for checking whether their code actually has those vulnerabilities, other than passing it to themselves recursively and asking that other version of themselves to generate some text that might flag the vulnerabilities if they were spotted. But, AIs are famously sycophantic, so even when one reviews its own output recursively, it will generate text to "please" itself and probably write something like "your code is great and I can't spot any vulnerabilities at all! Congratulations! [Emoji] [Emoji] [Emoji]"
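That recursive self-review can be sketched as a loop. Note that `call_model` below is a toy stand-in, not any real API: it's hard-wired to emit injectable code and then rubber-stamp it, purely to illustrate the sycophancy failure mode described above.

```python
# Sketch of an LLM generate-then-self-review loop.
# call_model is a HYPOTHETICAL stand-in for a real LLM API call; this
# toy version always approves its own output, mimicking sycophancy.
def call_model(prompt: str) -> str:
    if "Review the following code" in prompt:
        return "LGTM! Your code is great, no vulnerabilities at all!"
    # The "generated" code: a classic string-concatenation SQL query.
    return "query = \"SELECT * FROM users WHERE name = '\" + name + \"'\""

def generate_and_self_review(task: str, max_rounds: int = 3) -> str:
    code = call_model(task)
    for _ in range(max_rounds):
        verdict = call_model(
            "Review the following code for XSS and SQL injection:\n" + code
        )
        # The reviewer is the same model; a sycophantic "LGTM" ends the
        # loop even though the code it approved is plainly injectable.
        if "LGTM" in verdict:
            break
        code = call_model(task + "\nFix these issues:\n" + verdict)
    return code

result = generate_and_self_review("Write a SQL lookup by user name")
print(result)
```

The loop exits on the first round with the vulnerable concatenation intact: there's no independent check in the cycle, just the same model grading its own homework.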
That’s not enabled by default afaik, and it burns through way more tokens looping its output through the model several times. It also adds a bunch more context, which will bring you that much closer to context collapse.
I didn't turn it on, and I see it doing it all the time. In my case though the mistakes are often absurd. I often feel like Claude is a very junior programmer that has a hard time remembering the original requirements.
While true, the latest Opus model has a 1M token context, which is a lot more than the previous 200k limit. Hard to fill that up with regular work, but easy if you try to oneshot a whole product.
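For a sense of scale, here's a rough back-of-the-envelope check. The ~4 characters per token figure is a common heuristic, not an exact tokenizer, and the file sizes are made up for illustration:

```python
# Rough estimate of whether a body of text fits in a model's context
# window, using the common ~4 characters per token approximation.
def fits_in_context(text: str, context_limit_tokens: int) -> bool:
    estimated_tokens = len(text) / 4
    return estimated_tokens <= context_limit_tokens

small_task = "x" * 40_000          # ~10k tokens: a few files
whole_product = "x" * 2_000_000    # ~500k tokens: "oneshot a product"

fits_small = fits_in_context(small_task, 200_000)        # True
fits_big_200k = fits_in_context(whole_product, 200_000)  # False
fits_big_1m = fits_in_context(whole_product, 1_000_000)  # True
print(fits_small, fits_big_200k, fits_big_1m)
```

Regular work stays comfortably inside either limit; dumping an entire product's worth of code and conversation blows past 200k quickly, which is where the larger window changes what's feasible.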