this post was submitted on 31 Mar 2026
452 points (99.8% liked)
Technology
It doesn't fix it, but as stupid as it looks, it should actually improve the chances.
If you've seen how the reasoning models work, they basically spit out some garbage, then read it back and decide whether it's too much garbage or not.
They do try to 'correct their errors', so to speak.
It will slightly improve the chances. But, is that enough?
Imagine you had an intern working with you on a project. They didn't know anything about SQL injection, cross-site scripting, etc. You probably wouldn't give them a task where that was a concern. If you did, you'd watch them like a hawk. Because they're an intern, the amount of code they'd produce would probably be pretty low, and it would be pretty low-quality overall, so it would be easy to spot the mistakes that lead to these kinds of vulnerabilities.
An LLM has the understanding of the problem space that an intern does, but produces vast amounts of code extremely quickly. That code is designed to "blend in", i.e. it's specifically trained to look like good code, whether it is or not. Because of "vibe coding", people trust it to do all kinds of things, including implement bits where there's a danger of XSS or SQL injection. And the way Claude Code ensures it doesn't generate those vulnerabilities is... someone says "hey, don't do that, ok?"
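To make concrete what SQL injection actually looks like, here's a minimal sketch using Python's built-in sqlite3 module (the table and the malicious input are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.execute("INSERT INTO users VALUES ('alice', 0)")

# Attacker-controlled input containing a quote and an OR clause.
user_input = "alice' OR '1'='1"

# Vulnerable: the input is concatenated straight into the SQL string,
# so the injected OR clause becomes part of the query and matches
# every row in the table.
unsafe = conn.execute(
    "SELECT * FROM users WHERE name = '" + user_input + "'"
).fetchall()

# Safe: a parameterized query treats the input as data, never as SQL.
# No user is literally named "alice' OR '1'='1", so nothing matches.
safe = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe), len(safe))
```

The concatenated version returns the whole table; the parameterized version returns nothing, because the quoting trick never reaches the SQL parser.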
Having that statement in there is better than not having it. But, it's just a reminder that these things aren't appropriate for writing production code. They don't actually understand what XSS or SQL injection are, and they can't learn. They don't know why it's important. They don't have a technique for checking whether their code actually has those vulnerabilities, other than passing it to themselves recursively and asking that other version of themselves to generate some text that might flag the vulnerabilities if they were spotted. But, AIs are famously sycophantic, so even when one reviews its own output recursively, it will generate text to "please" itself and probably write something like "your code is great and I can't spot any vulnerabilities at all! Congratulations! [Emoji] [Emoji] [Emoji]"
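That recursive self-review can be sketched as a loop. Note that `call_model` below is a toy stand-in, not any real API: it's hard-wired to emit injectable code and then rubber-stamp it, purely to illustrate the sycophancy failure mode described above.

```python
# Sketch of an LLM generate-then-self-review loop.
# call_model is a HYPOTHETICAL stand-in for a real LLM API call; this
# toy version always approves its own output, mimicking sycophancy.
def call_model(prompt: str) -> str:
    if "Review the following code" in prompt:
        return "LGTM! Your code is great, no vulnerabilities at all!"
    # The "generated" code: a classic string-concatenation SQL query.
    return "query = \"SELECT * FROM users WHERE name = '\" + name + \"'\""

def generate_and_self_review(task: str, max_rounds: int = 3) -> str:
    code = call_model(task)
    for _ in range(max_rounds):
        verdict = call_model(
            "Review the following code for XSS and SQL injection:\n" + code
        )
        # The reviewer is the same model; a sycophantic "LGTM" ends the
        # loop even though the code it approved is plainly injectable.
        if "LGTM" in verdict:
            break
        code = call_model(task + "\nFix these issues:\n" + verdict)
    return code

result = generate_and_self_review("Write a SQL lookup by user name")
print(result)
```

The loop exits on the first round with the vulnerable concatenation intact: there's no independent check in the cycle, just the same model grading its own homework.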
That’s not enabled by default afaik, and it burns through way more tokens looping its output through the model several times. It also adds a bunch more context, which will bring you that much closer to context collapse.
I didn't turn it on, and I see it doing it all the time. In my case though the mistakes are often absurd. I often feel like Claude is a very junior programmer that has a hard time remembering the original requirements.
While true, the latest Opus model has a 1M token context, which is a lot more than the previous 200k limit. Hard to fill that up with regular work, but easy if you try to oneshot a whole product.
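For a sense of scale, here's a rough back-of-the-envelope check. The ~4 characters per token figure is a common heuristic, not an exact tokenizer, and the file sizes are made up for illustration:

```python
# Rough estimate of whether a body of text fits in a model's context
# window, using the common ~4 characters per token approximation.
def fits_in_context(text: str, context_limit_tokens: int) -> bool:
    estimated_tokens = len(text) / 4
    return estimated_tokens <= context_limit_tokens

small_task = "x" * 40_000          # ~10k tokens: a few files
whole_product = "x" * 2_000_000    # ~500k tokens: "oneshot a product"

fits_small = fits_in_context(small_task, 200_000)        # True
fits_big_200k = fits_in_context(whole_product, 200_000)  # False
fits_big_1m = fits_in_context(whole_product, 1_000_000)  # True
print(fits_small, fits_big_200k, fits_big_1m)
```

Regular work stays comfortably inside either limit; dumping an entire product's worth of code and conversation blows past 200k quickly, which is where the larger window changes what's feasible.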