this post was submitted on 17 Mar 2026
660 points (98.1% liked)

Programming


Excerpt:

"Even within the coding, it's not working well," said Smiley. "I'll give you an example. Code can look right and pass the unit tests and still be wrong. The way you measure that is typically in benchmark tests. So a lot of these companies haven't engaged in a proper feedback loop to see what the impact of AI coding is on the outcomes they care about. Lines of code, number of [pull requests], these are liabilities. These are not measures of engineering excellence."

Measures of engineering excellence, said Smiley, include metrics like deployment frequency, lead time to production, change failure rate, mean time to restore, and incident severity. And we need a new set of metrics, he insists, to measure how AI affects engineering performance.

"We don't know what those are yet," he said.

One metric that might be helpful, he said, is measuring tokens burned to get to an approved pull request – a formally accepted change in software. That's the kind of thing that needs to be assessed to determine whether AI helps an organization's engineering practice.
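As a rough sketch of what such a metric could look like in practice (the record structure and numbers below are hypothetical, not from the article), tokens burned per approved pull request might be computed from LLM usage logs joined against review outcomes:

```python
# Hypothetical sketch: "tokens burned per approved PR".
# The log format is invented for illustration; real data would come
# from your LLM provider's usage logs and your code-review system.

def tokens_per_approved_pr(usage_log):
    """usage_log: list of dicts with 'pr_id', 'tokens', 'approved'."""
    approved = {r["pr_id"] for r in usage_log if r["approved"]}
    if not approved:
        return None  # no approved PRs yet; metric is undefined
    total = sum(r["tokens"] for r in usage_log if r["pr_id"] in approved)
    return total / len(approved)

log = [
    {"pr_id": 1, "tokens": 12_000, "approved": True},
    {"pr_id": 1, "tokens": 8_000, "approved": True},   # retries count too
    {"pr_id": 2, "tokens": 40_000, "approved": False},  # abandoned PR
]
print(tokens_per_approved_pr(log))  # 20000.0
```

One design choice here is to count only tokens spent on PRs that eventually got approved; you could instead divide all tokens burned (including abandoned PRs) by approved PRs, which would penalize wasted attempts more heavily.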

To underscore the consequences of not having that kind of data, Smiley pointed to a recent attempt to rewrite SQLite in Rust using AI.

"It passed all the unit tests, the shape of the code looks right," he said. "It's 3.7x more lines of code that performs 2,000 times worse than the actual SQLite. Two thousand times worse for a database is a non-viable product. It's a dumpster fire. Throw it away. All that money you spent on it is worthless."

All the optimism about using AI for coding, Smiley argues, comes from measuring the wrong things.

"Coding works if you measure lines of code and pull requests," he said. "Coding does not work if you measure quality and team performance. There's no evidence to suggest that that's moving in a positive direction."

[–] olafurp@lemmy.world -4 points 11 hours ago (4 children)

I've got a hot take on this. People are treating AI as a fire-and-forget tool when they really should be treating it like a junior dev.

Now here's what I think: it's a force multiplier. Assume one dev has a profile of:

- 2x feature progress
- 2x tech debt removed
- 1x tech debt added

Net tech-debt-adjusted productivity: 2 + 2 - 1 = 3x. Multiply by 2 for AI and you have a 6x engineer.

Now another, common case: 1x feature progress with net tech debt of -1.5x gives 1 - 1.5 = -0.5x, which AI doubles into a -1x engineer.

The latter engineer, with AI, will crank out features as fast as the former does without it, but will make the code base worse much faster.
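The multiplier arithmetic above can be sketched as a toy model (the profile numbers are the hypothetical examples from this comment, not measured data):

```python
def net_productivity(features, debt_removed, debt_added, ai_multiplier=1.0):
    """Toy 'tech-debt-adjusted' productivity: AI scales the net result."""
    return (features + debt_removed - debt_added) * ai_multiplier

# Strong dev: 2x features, 2x debt removed, 1x debt added, AI doubles it
print(net_productivity(2, 2, 1, ai_multiplier=2))    # 6

# Common case: 1x features, no debt removed, 1.5x debt added
print(net_productivity(1, 0, 1.5, ai_multiplier=2))  # -1.0
```

The point of the model is that the multiplier amplifies whatever sign the net productivity already has: AI makes a net-positive dev more positive and a net-negative dev more negative.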

Now imagine that the latter engineer really leans into AI, gets really good at cranking out features, gets commended for it, and continues. He'll end up creating bad code at an alarming pace until the code base becomes brittle and unwieldy. That's what I'm guessing will happen over the next few years. More experienced devs will see a massive benefit, but more junior devs will need to be reined in a lot.

Going forward, architecture and isolation of concerns will become more important so we can throw away garbage and rewrite it much faster.

[–] Buddahriffic@lemmy.world 5 points 7 hours ago

It's not even a junior dev. It might "understand" a wider and deeper set of things than a junior dev does, but at least a junior dev has some sense of coherence in what they build.

I use gen AI at work (because they want me to) and holy shit is it "deceptive". In quotes because it has no intent at all, but it's just good enough to make it seem like it mostly did what was asked; look closer and you'll see it isn't following any kind of paradigm, it's still just predicting text.

The amount of context it can include in those predictions is impressive, don't get me wrong, but it has zero actual problem solving capability. What it appears to "solve" is just pattern matching the current problem to a previous one. Same thing with analysis, brainstorming, whatever activity can be labelled as "intelligent".

Hallucinations are just cases where it matches a pattern that isn't based on truth (either mispredicting or predicting a lie). But it also goes the other way: it misses patterns that are there, which is terrible for programming if you care at all about efficiency and accuracy.

It'll do things like write a great helper function that it uses once but never again, maybe even writing a second copy of it the next time it would use it. Or forgetting instructions (in a context window of 200k, a few lines can easily get drowned out).
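A hypothetical illustration of that pattern (invented code, not real model output): a helper gets written for the first call site, then re-implemented inline at the next one instead of being reused.

```python
# Hypothetical example of LLM-generated duplication: the model writes
# a helper, then later re-derives the same logic instead of calling it.

def normalize_email(addr: str) -> str:
    """Helper the model wrote for the first call site."""
    return addr.strip().lower()

def register_user(email: str) -> dict:
    return {"email": normalize_email(email)}  # helper used once...

def send_invite(email: str) -> str:
    # ...then the same logic re-implemented inline instead of reused.
    cleaned = email.strip().lower()
    return f"invite sent to {cleaned}"

print(register_user("  Alice@Example.COM ")["email"])  # alice@example.com
print(send_invite("  Bob@Example.COM "))  # invite sent to bob@example.com
```

Both call sites behave identically today, but the duplicated logic can silently drift apart the next time one of them is edited.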

Code quality is going to suffer as AI gets adopted more and more. And I believe the problem is fundamental to the way LLMs work. The LLM-based patches I've seen so far aren't going to fix it.

Also, as much as it's nice not to have to write a whole lot of code, my software dev skills aren't being used very well. It's like I'm babysitting an expert programmer with Alzheimer's who thinks they're still at their prime and doesn't realize they've forgotten what they did five minutes ago. But my company pays them big money, gets upset if we don't use their expertise, and probably intends to use my AI chat logs to train my replacement, since everything I know can be parsed out of those conversations.

[–] GiorgioPerlasca@lemmy.ml 3 points 7 hours ago

Junior software developers understand the task. They improve their skill in understanding the code and writing better code. They can read the documentation.

Large language models just generate code based on what it looked like in previous examples.

[–] forrgott@lemmy.zip 7 points 11 hours ago

Or maybe don't try and drive a screw in with a hammer?

It's just not good for 99% of the shit it's marketed for. Sorry.

[–] TheReturnOfPEB@reddthat.com 6 points 11 hours ago

WALL OF TEXT that says inadvertently that junior devs should be treated like machines not people.