this post was submitted on 16 Jun 2026
Technology

This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.


[–] Jayjader@jlai.lu 4 points 8 hours ago (1 children)

I've been pleasantly surprised by Qwen3.6-27b on a Radeon 6700xt (12GB of VRAM) with 32GB of system RAM for it to offload onto (especially when pushing the context window up past 50k). Definitely more of a "compose prompt and hit send -> do something else -> check back after a while to view results" experience than an engaged back-and-forth, but at least compared to previous models I've tried running over the past year or two the results are palatable and sometimes even meaningfully useful.

Given the speed I get, I've mostly found it useful for doing overviews of a codebase, with some sort of improvement plan suggested at the end. Tool calls work, but I'm still not comfortable letting it code outright (plus, I think I can still code faster than it for now).

[–] yogthos@lemmy.ml 3 points 8 hours ago (1 children)

I kind of look at the whole agentic harness setup as a genetic algorithm. Your tests and specs are the fitness function for the program you’re evolving, and the LLM is the mutator. At each step it generates some output, that output gets tested against the fitness function, and the LLM gets feedback and iterates on it. Eventually something working falls out. The better you can define the selection criteria, the more you box the agent in and the better the results you get.
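
A minimal sketch of that loop in Python, for illustration only. Everything here is hypothetical: `run_tests` plays the fitness function, `evolve` is the harness loop, and `stub_mutator` is a hard-coded stand-in for the LLM call (a real harness would send the failing test names back to the model instead).

```python
from typing import Callable, List, Optional, Tuple

Candidate = Callable[[int], int]
# A test is a (name, check) pair; check returns True when the candidate passes.
Test = Tuple[str, Callable[[Candidate], bool]]


def run_tests(candidate: Candidate, tests: List[Test]) -> List[str]:
    """Fitness function: return the names of the tests the candidate fails."""
    failures = []
    for name, check in tests:
        try:
            ok = check(candidate)
        except Exception:
            ok = False  # a crashing candidate counts as a failure
        if not ok:
            failures.append(name)
    return failures


def evolve(
    propose: Callable[[List[str]], Candidate],
    tests: List[Test],
    max_steps: int = 10,
) -> Tuple[Optional[Candidate], int]:
    """Generate, test, feed failures back, repeat until the tests pass."""
    feedback: List[str] = []
    for step in range(1, max_steps + 1):
        candidate = propose(feedback)          # the "mutator" step
        feedback = run_tests(candidate, tests)  # the "selection" step
        if not feedback:
            return candidate, step
    return None, max_steps


def stub_mutator(feedback: List[str]) -> Candidate:
    """Hypothetical stand-in for an LLM: picks a fix based on the failures."""
    if "handles_negative" in feedback:
        return lambda x: abs(x) * 2
    return lambda x: x * 2


# Spec for the toy target: double the absolute value of the input.
TESTS: List[Test] = [
    ("doubles_positive", lambda f: f(3) == 6),
    ("handles_negative", lambda f: f(-3) == 6),
]

winner, steps = evolve(stub_mutator, TESTS)
```

In this toy run the first candidate fails `handles_negative`, the failure is fed back, and the second candidate passes. The tighter the test list, the fewer wrong candidates can slip through.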

The trick I can recommend for getting the model to code is to ask it to come up with a phased plan composed of focused features, and then to build each feature on its own branch. That way you have a clear unit of work that does a specific thing which makes it much easier to review the code. Can also recommend tools like https://github.com/Fission-AI/OpenSpec for making specs to box the model in when it works.

[–] Jayjader@jlai.lu 1 points 6 hours ago (1 children)

I really dislike the idea of making the whole program a genetic algorithm - that approach is nice when you don't have a straightforward approach to employ/enact, but otherwise it feels both overkill and horrendously inefficient.

The next step for my own harness (whenever I get back to working on it) is definitely to look at leveraging structured outputs to help these smaller models iterate towards a longer term goal.
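
One way that could look, as a rough sketch rather than a description of any particular harness: ask the model to reply with a small JSON object, validate it, and hand the validation errors back as feedback. The field names (`goal`, `next_step`, `done`) and the `parse_step` helper are made up for the example; only the standard library is used.

```python
import json
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PlanStep:
    goal: str       # the longer-term goal the model is working towards
    next_step: str  # the single action it proposes next
    done: bool      # whether it believes the goal is reached


# Hypothetical schema: required field name -> expected Python type.
REQUIRED = {"goal": str, "next_step": str, "done": bool}


def parse_step(raw: str) -> Tuple[Optional[PlanStep], List[str]]:
    """Parse a model reply; return (step, errors). Errors can be fed back."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["top-level value must be an object"]

    errors = []
    for key, expected in REQUIRED.items():
        if key not in data:
            errors.append(f"missing field: {key}")
        elif not isinstance(data[key], expected):
            errors.append(f"field {key} must be {expected.__name__}")
    if errors:
        return None, errors
    return PlanStep(**{key: data[key] for key in REQUIRED}), []
```

A reply such as `{"goal": "g"}` would come back with no step and a list of the missing fields, which gives a small model something concrete to correct on the next turn.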

[–] yogthos@lemmy.ml 2 points 6 hours ago* (last edited 6 hours ago)

I don't mean you turn the program itself into a genetic algorithm. I'm saying that the agentic loop for producing code acts as one. The code itself is just regular code. And the loop isn't really any more inefficient than what you do as a developer. It almost never happens that you write perfect code on a first try in practice. You'll write some code, run your tests, look how it did, and iterate. That's precisely the same process the agent follows.

The difference from a typical genetic algorithm is that the LLM is not just randomly generating text that eventually fits into the shape you specified. It's generating code that's already close to what's intended most of the time, and it just needs a bit of massaging to get completely right. That's the feedback loop here.