minimax m2.5 works way better than chatgpt for coding and it's half the price
Technology
A tech news sub for communists
Also, I'd argue that you don't actually need huge models for coding. The problem is with the way we structure code today, which isn't conducive to how LLMs work. Even small models that you can run locally are quite competent at writing small chunks of code, say 50~100 lines or so. And any large application can be broken up into smaller isolated components.
The way I look at it is that we can view applications as state machines. For any workflow, you can draw out a state chart where you have nodes that do some computation, and then the state transitions to another node in the graph. The problem with the traditional coding style is that we implicitly bake this graph into function calls. You have a piece of code that does some logic, like authenticating a user, and then it decides what code should run after that. That creates coupling, because now you have to trace through code to figure out what the data flow actually is. This is difficult for agents to do because it makes the context grow in an unbounded way, leading to context rot. When an LLM has too much data in its context, it doesn't really know what's important or what to focus on, so it ends up going off the rails.
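To make the coupling concrete, here's a toy sketch of the traditional style, where each function both does its work and picks what runs next (all names and shapes here are invented for illustration):

```python
# Traditional coupled style: the control-flow graph is buried inside the
# call chain, so you have to read every function to see the data flow.
# The request/db shapes are invented for illustration.

def render_login_page():
    return {"page": "login"}

def load_dashboard(user):
    return {"page": "dashboard", "user": user}

def authenticate(request, db):
    user = db.get(request.get("session_token"))
    if user is None:
        return render_login_page()   # this function decides the next step...
    return load_dashboard(user)      # ...and so does this branch
```

The graph edge "auth failed -> login page" exists only implicitly, inside `authenticate`'s body.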
But now, let's imagine that we apply inversion of control here. Instead of having each node in the state graph call the next one, why not pull that logic out? We could pass a data structure around that each node gets as its input; it does some work, and then returns a new state. A separate conductor component manages the workflow: it inspects the state and decides which edge of the graph to take.
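A minimal sketch of what that conductor could look like, assuming nodes are plain state-in/state-out functions (all names here are made up for illustration):

```python
# Conductor sketch: nodes take the current state and return a new state,
# and a declarative edge map decides which node runs next. Nothing here
# calls anything else directly; all names are illustrative.

def fetch_input(state):
    return {**state, "value": state["raw"] * 2}

def check_value(state):
    return {**state, "ok": state["value"] > 10}

def handle_ok(state):
    return {**state, "result": "accepted"}

def handle_fail(state):
    return {**state, "result": "rejected"}

NODES = {
    "fetch_input": fetch_input,
    "check_value": check_value,
    "handle_ok": handle_ok,
    "handle_fail": handle_fail,
}

# Declarative graph: each entry is a routing rule, not a function call.
EDGES = {
    "fetch_input": lambda s: "check_value",
    "check_value": lambda s: "handle_ok" if s["ok"] else "handle_fail",
    "handle_ok": lambda s: None,    # terminal node
    "handle_fail": lambda s: None,  # terminal node
}

def run(start, state):
    node = start
    while node is not None:
        state = NODES[node](state)   # do the work
        node = EDGES[node](state)    # conductor picks the next edge
    return state
```

For example, `run("fetch_input", {"raw": 7})` doubles 7 to 14, routes through `handle_ok`, and ends with `result` set to `"accepted"`, while `{"raw": 3}` ends up `"rejected"`.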
The graph can be visually inspected, and it becomes easy for the human to tell what the business logic is doing. The graphs don't really have a lot of data in them either because they're declarative. They're decoupled from the actual implementation details that live in the logic of each node.
Going back to the user authentication example: the handler could get a parsed HTTP request, try to look up the user in the db, check whether the session token is present, etc. Then it updates the state to add the user, or sets a flag saying the user wasn't found or wasn't authenticated. The conductor can then look at the result and decide whether to move on to the next step or call the error handler.
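A toy version of such a handler node might look like this (the request shape, fake db, and node names are all invented for illustration):

```python
# Toy authentication node: it reads a parsed request from the state,
# looks the user up, and records the outcome in the state. It never
# decides what runs next. The request/db shapes are invented.

FAKE_DB = {"tok-123": {"id": 1, "name": "alice"}}

def authenticate(state):
    token = state["request"].get("session_token")
    user = FAKE_DB.get(token)
    if user is None:
        return {**state, "user": None, "authenticated": False}
    return {**state, "user": user, "authenticated": True}

# The conductor then routes on the flag the node set:
def route_after_auth(state):
    return "next_step" if state["authenticated"] else "error_handler"
```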
Now we basically have a bunch of tiny programs that know nothing about one another, and the agent working on each one has a fixed context that doesn't grow in an unbounded fashion. On top of that, we can have validation boundaries between the nodes, so the LLM can check that a component produces correct output, handles whatever side effects it's responsible for, and so on. Testing becomes much simpler too, because now you don't need to load the whole app; you can just test each component to make sure it fulfills its contract.
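A contract test for a single node can then be completely self-contained. Here's a trivial sketch (the node and its contract are made up for illustration):

```python
# A node with a simple contract: given a state with "amount", return a
# state with "total" = amount plus 10% tax, leaving the input keys
# intact. The node and contract are invented for illustration.

def add_tax(state):
    return {**state, "total": round(state["amount"] * 1.10, 2)}

# Contract test: no app startup, no server, no framework needed.
def test_add_tax():
    out = add_tax({"amount": 100.0})
    assert out["total"] == 110.0
    assert out["amount"] == 100.0  # input keys are preserved

test_add_tax()
```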
What's more is that each workflow can be treated as a node in a bigger workflow, so the whole thing becomes composable. And the nodes themselves are like reusable Lego blocks, since the context is passed in to them.
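The nesting works because a whole workflow can hide behind the same state-in/state-out signature as a single node. A rough sketch, with invented names:

```python
# A sub-workflow exposed as a node: from the outside it's just another
# state -> state function, so a parent workflow can route through it
# like any other node. All names are illustrative.

def step_a(state):
    return {**state, "a_done": True}

def step_b(state):
    return {**state, "b_done": True}

def sub_workflow(state):
    # Internally it runs its own tiny pipeline; externally it's one node.
    for step in (step_a, step_b):
        state = step(state)
    return state

# A parent conductor can now call sub_workflow exactly like step_a.
result = sub_workflow({"input": 1})
```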
And this idea isn't new; workflow engines have been around for a long time. The reason they haven't really caught on for general purpose programming is that it doesn't feel natural to code that way. There's a lot of ceremony involved in creating the workflow definitions, writing contracts for them, and jumping between those and the implementation of the nodes. But the equation changes when we're dealing with LLMs: they have no problem doing tedious tasks like that, and all the ceremony helps keep them on track.
I would wager that moving towards this style of programming would be a far more effective way to use these tools, and that the current crop of LLMs is more than good enough for it.
It's amazing that we were taught about finite state automata and machines, yet for programming large applications we haven't taken that same approach. As you mentioned, all of the coupling and the dependencies of function calls upon other function calls become impossibly difficult to debug and analyze as the complexity, functionality, and scale of an application grow. After reading what you wrote, it's like a new revelation; I'm eager to put this paradigm into practice. I've generally lost interest in many of my personal projects simply because they became so untenable as they grew, dependency upon dependency. The notion of splitting the code up into proper segments, or nodes, that each do some work and return their product to a central conductor via a state-tracking data structure is something I plan on using going forward. Perhaps I'll even revisit some of those abandoned projects, view their flow of execution under this lens, and restructure operations accordingly.
a guy i worked with at my last job is forming a company doing something akin to this. he's been working on it for 6 months or so. he's a math wizard and believes he can get it to work in a way that can be mathematically proven, but to be honest, when he talks about it, it all goes way over my head pretty quickly.
i hope you're both right, because it would be great not to have to wrestle models to do things. if you can give a prompt for what you want done, and the orchestrator can break it down into workable tasks and then pass those tasks out to agents to do, check, and verify in a way that is reliable, it will be a game changer for sure.
the downside will be that most IT jobs will be gone pretty quickly
The following is intended as satire, not a real reported picture of events:
An inconsolable Sam Altman was found in his office, trying to teach a robot dog to put blocks in the right holes by offering it increasing sums of money each time it failed. With bags under his eyes, empty cans littering the walls, and a worn hoodie on his shoulders with the hood pulled up, the once bright-eyed and optimistic CEO, heralding the arrival of true intelligence any day now, was sharing some hard truths.
"It turns out the Chinese know how to build things," he says, while he pokes the robot dog with a stick and tries increasing the offered sum to $500 instead of $450. The robot makes a movement and tries to nudge a square block into a round hole, which frustrates the weary Altman. "This robot is useless. How is it ever going to help break strikes if it can't handle simple shapes?"
Sam Altman rose to fame heading up a company known for transparency and openness called OpenAI. The company later reversed course when it realized that wasn't profitable enough or good for business. When asked about his origins, some light returns to Altman's increasingly deadened eyes, if only for a moment.
"Back then, it was about making waves. Now it's about crushing the competition with those waves until their bones break. For legal purposes, that was a metaphor." He prods the robot a few more times and a fanatical gleam rises in him. "Let's raise the stakes." He ups the offering to $5 billion. After all, he explains, it's not like the robot will actually get the money anyway. The robot dog goes for a round block this time. The atmosphere in the room is tense. We could be witnessing history.
Then the robot nudges the round block at the square hole. Altman lets out a long-suffering sigh and puts his head in his hands. "The Chinese have something we don't," he admits in a rare candid moment. What is the miracle factor they have on hand? Socialism? Money? Integrity? Altman shakes his head, "It's magic, it has to be, there's no other explanation. They're tapping into the cosmic energies that make up our world. And it's going to take the Grand Wizard to stop them."
When pressed about his choice of title and its historical connotations, he ends the interview and refocuses his attention on the robot dog, now upping the offer to $100 trillion. "What do I pay you for?!?" He is heard screaming as we retreat down the hallway, being escorted out by security.