this post was submitted on 03 Jun 2026
815 points (99.6% liked)

Managers (media.piefed.zip)
submitted 2 days ago* (last edited 2 days ago) by inari@piefed.zip to c/whitepeopletwitter@sh.itjust.works
 
[–] Kazumara@discuss.tchncs.de 1 points 1 day ago (1 children)

Inference is cheap and efficient.

Tell that to all the GitHub users who are screaming about the new token-based billing. In reality, inference on these massive models with big context windows is expensive, but it was subsidized so heavily that nobody has an accurate feeling for the cost.

[–] theunknownmuncher@lemmy.world 1 points 1 day ago (1 children)

No, it is cheap and efficient. It is relative, and the comparison is to model training. But yeah, it's not free.

[–] Kazumara@discuss.tchncs.de 1 points 1 day ago* (last edited 22 hours ago)

Sure, it's much, much cheaper than training, but importantly those companies are not recouping anything with inference, because it still costs them more to run than what they are selling it for.

They are double bankrupting themselves.

At work we run inference for a research project with an open-weights model in the public cloud that another part of my company provides, and we pay around $25 a day for a VM with a single L40S. It's both slow, despite not even serving concurrent users, and kind of bad in its outputs.
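
To put that daily price in per-token terms, here is a rough back-of-the-envelope sketch. Only the $25 a day comes from the setup above; the throughput and utilization numbers are purely hypothetical placeholders, not measurements, and the function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def cost_per_million_tokens(daily_cost_usd: float,
                            tokens_per_second: float,
                            utilization: float = 1.0) -> float:
    """Convert a flat daily VM price into USD per one million generated tokens.

    tokens_per_second and utilization are assumptions the caller has to supply;
    neither value is given anywhere in this thread.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    # Tokens the VM actually produces in a day at the given duty cycle.
    tokens_per_day = tokens_per_second * utilization * SECONDS_PER_DAY
    return daily_cost_usd / tokens_per_day * 1_000_000


if __name__ == "__main__":
    daily_cost = 25.0          # from the comment above
    assumed_throughput = 20.0  # hypothetical tokens/s, NOT a measured figure
    for util in (1.0, 0.5, 0.1):
        usd = cost_per_million_tokens(daily_cost, assumed_throughput, util)
        print(f"utilization {util:.0%}: ~${usd:.2f} per 1M tokens")
```

The point of the utilization parameter is that a rented VM bills the same whether it is busy or idle, so the effective per-token price climbs quickly when the GPU is not kept saturated.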

Edit: Interference -> Inference, arguing on the internet after waking up first thing in the morning might not have been the best idea