this post was submitted on 29 Dec 2025
22 points (100.0% liked)
Technology
Love these developments. Can't wait to be able to run a decent LLM on lower-end consumer hardware, or even on a decent laptop without a GPU.
From the research I've seen published so far, I'm pretty confident it should be possible to get the same quality out of smaller models, say 32B parameters or maybe even less, that we currently see from full-blown 600B+ parameter models. A lot of it seems to come down to tracking context out of band so it doesn't all need to live in memory, and people have come up with a number of approaches for doing that. We'll probably also see more work on the mixture-of-experts (MoE) approach, where only the parts of the model actually relevant to the current task get loaded. I've put rough sketches of both ideas below. It's also possible we'll see novel approaches like this that could significantly reduce the memory requirements.
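To make the "out of band" idea concrete, here's a minimal sketch (my own illustration, not taken from any particular paper): older conversation turns live in an external store, and only the few most relevant ones get pulled back into the prompt, so the in-memory context stays small. `ExternalContext` and `toy_embed` are made-up names just for this sketch; a real system would use a proper embedding model and vector index.

```python
# Sketch: keep context "out of band" in an external store and recall
# only the most relevant pieces, instead of holding everything in the
# model's context window.
import numpy as np

class ExternalContext:
    def __init__(self, embed):
        self.embed = embed            # any text -> vector function
        self.texts, self.vecs = [], []

    def add(self, text: str):
        # store the raw text plus its embedding for later lookup
        self.texts.append(text)
        self.vecs.append(self.embed(text))

    def recall(self, query: str, n: int = 3):
        # cosine similarity between the query and every stored turn
        q = self.embed(query)
        sims = [v @ q / (np.linalg.norm(v) * np.linalg.norm(q)) for v in self.vecs]
        best = np.argsort(sims)[-n:][::-1]    # indices of the n closest turns
        return [self.texts[i] for i in best]

# toy embedding: hash words into a fixed-size bag-of-words vector
# (hash() is stable within a single Python run, which is all we need here)
def toy_embed(text: str, dim: int = 64) -> np.ndarray:
    v = np.zeros(dim)
    for w in text.lower().split():
        v[hash(w) % dim] += 1.0
    return v

store = ExternalContext(toy_embed)
store.add("user asked about quantizing a 7B model")
store.add("user prefers running models on CPU only")
print(store.recall("what hardware does the user run on?", n=1))
```

And a toy version of the MoE routing idea in PyTorch, again just my own sketch rather than any specific model's architecture: a small router scores the experts for each token and only the top-k experts actually run, so most of the network's weights are never touched for a given token.

```python
# Sketch: top-k mixture-of-experts routing. Each token only activates
# k of the n_experts feed-forward blocks, weighted by the router.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, dim: int, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(dim, n_experts)   # scores every expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, dim)
        weights, idx = self.router(x).topk(self.k, dim=-1)  # (n_tokens, k)
        weights = F.softmax(weights, dim=-1)                # normalize over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in idx[:, slot].unique().tolist():
                mask = idx[:, slot] == e   # tokens routed to expert e in this slot
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out

moe = TopKMoE(dim=64)
print(moe(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

This toy version still keeps all the expert weights instantiated; the memory win the comment is pointing at comes from loading only the selected experts' weights on demand, which the same routing structure makes possible.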