this post was submitted on 17 Jul 2025


We all know that AI is expensive, but a new set of algorithms developed by researchers at the Weizmann Institute of Science, Intel Labs, and d-Matrix could significantly reduce the cost of serving up your favorite large language model (LLM) with just a few lines of code.

Presented at the International Conference on Machine Learning this week and detailed in this paper, the algorithms offer a new spin on speculative decoding that the researchers say can boost token generation rates by as much as 2.8x while also eliminating the need for specialized draft models.

Speculative decoding, if you're not familiar, isn't a new concept. It works by using a small "draft" model ("drafter" for short) to predict the outputs of a larger, slower, but higher-quality "target" model.

If the draft model can successfully predict, say, the next four tokens in the sequence, the bigger model only has to verify them, which it can do in a single pass rather than generating them one at a time, and so we get a speed-up. If a draft token is wrong, the larger model discards it and everything after it, and generates the replacement itself. That last bit is important as it means the entire process is lossless: there's no trade-off in quality required to get that speed-up.
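
For anyone who wants to see the draft-then-verify loop concretely, here is a minimal sketch in Python. It is not the algorithm from the paper: it only illustrates classic speculative decoding under greedy (deterministic) decoding, and the two "models" are toy functions invented for the example. The names `speculative_decode`, `target_next`, `draft_next` and the parameter `k` are all hypothetical. In a real system the target checks all `k` draft tokens in one batched forward pass; the inner loop below stands in for that pass.

```python
from typing import Callable, List, Tuple

# A "model" here is just a function: token sequence -> next token (greedy).
NextToken = Callable[[List[int]], int]


def greedy_decode(next_token: NextToken, prompt: List[int], n: int) -> List[int]:
    """Plain autoregressive decoding: one model call per new token."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(next_token(seq))
    return seq


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], n: int, k: int = 4) -> Tuple[List[int], int]:
    """Generate n tokens; return (sequence, number of target verification passes)."""
    seq = list(prompt)
    goal = len(prompt) + n
    target_passes = 0
    while len(seq) < goal:
        # 1. The cheap draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(seq + proposal))
        # 2. The target verifies the proposal (one batched pass in a real system).
        target_passes += 1
        for tok in proposal:
            expected = target(seq)
            seq.append(expected)  # always the target's own choice, so output is lossless
            if expected != tok or len(seq) == goal:
                break  # first mismatch: the rest of the draft is discarded
        else:
            # Every draft token was accepted; the same pass also yields one bonus token.
            seq.append(target(seq))
    return seq, target_passes


# Toy models (assumptions for illustration only).
def target_next(seq: List[int]) -> int:
    return (seq[-1] * 3 + 1) % 11


def draft_next(seq: List[int]) -> int:
    guess = target_next(seq)
    # The drafter is deliberately wrong whenever the last token is divisible by 4.
    return guess if seq[-1] % 4 else (guess + 1) % 11
```

Calling `speculative_decode(target_next, draft_next, [1], 20)` returns the same sequence as `greedy_decode(target_next, [1], 20)`, but with fewer target passes than tokens generated, which is where the speed-up comes from. With sampling instead of greedy decoding, real implementations use a probabilistic accept/reject rule rather than an exact-match check.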
