this post was submitted on 01 Jul 2025
2152 points (98.4% liked)

Microblog Memes


A place to share screenshots of Microblog posts, whether from Mastodon, tumblr, ~~Twitter~~ X, KBin, Threads or elsewhere.

Created as an evolution of White People Twitter and other tweet-capture subreddits.

 
[–] Jakeroxs@sh.itjust.works 0 points 1 month ago (16 children)

I do, because they're not at full load the entire time they're in use

[–] FooBarrington@lemmy.world 1 points 1 month ago (15 children)

They are; it'd be uneconomical not to keep them fully utilized the whole time. Look up how batching works.
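A toy sketch of the batching point (hypothetical numbers, not any provider's actual scheduler): a transformer forward pass costs roughly the same whether it carries 1 prompt or 16, so inference servers pack concurrent requests into batches to keep the GPU saturated instead of idling between single requests.

```python
def forward_passes(num_requests: int, batch_size: int) -> int:
    """Number of GPU forward passes needed to serve all requests,
    assuming each pass can carry up to batch_size requests."""
    # Ceiling division: partial batches still cost a full pass.
    return -(-num_requests // batch_size)

# Serving 64 concurrent requests one at a time vs. batched 16 at a time:
print(forward_passes(64, 1))   # 64 passes, GPU mostly waiting between them
print(forward_passes(64, 16))  # 4 passes, GPU kept busy the whole time
```

The numbers are illustrative only; real serving stacks use continuous batching, where new requests join in-flight batches token by token, but the economics are the same.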

[–] Jakeroxs@sh.itjust.works 1 points 1 month ago* (last edited 1 month ago) (14 children)

I mean, I literally run a local LLM. While the model sits in memory it's really not using a crazy amount of resources; to be fair, I should hook something up to actually measure exactly how much it's pulling instead of just looking at htop/atop and guesstimating based on load.

Versus when I play a game: the fans start blaring, it heats up, and you can clearly see the usage increase across various metrics.
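If you do want numbers instead of guesstimates, `nvidia-smi` can report per-GPU power draw directly. A minimal sketch, assuming an NVIDIA card and driver on the host (the sample CSV values below are made up for illustration):

```python
import subprocess

def parse_power_draw(csv_text: str) -> list[float]:
    """Parse watts from the CSV output of
    `nvidia-smi --query-gpu=power.draw --format=csv,noheader`
    (one line per GPU, e.g. '38.51 W')."""
    watts = []
    for line in csv_text.strip().splitlines():
        watts.append(float(line.replace("W", "").strip()))
    return watts

def sample_gpu_power() -> list[float]:
    """One-shot power reading; only works where nvidia-smi is installed."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_power_draw(out)

# Example of the CSV format nvidia-smi emits (fabricated values):
print(parse_power_draw("38.51 W\n41.02 W\n"))  # [38.51, 41.02]
```

Polling this while the model is idle-but-loaded versus mid-generation would settle the question better than eyeballing htop.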

[–] FooBarrington@lemmy.world 1 points 1 month ago (1 children)

My guy, we're not talking about just leaving a model loaded, we're talking about actual usage in a cloud setting with far more GPUs and users involved.

[–] Jakeroxs@sh.itjust.works 1 points 1 month ago (1 children)

So you think they're all at full load at all times? Does that seem reasonable to you?

[–] FooBarrington@lemmy.world 1 points 1 month ago (1 children)

Given that cloud providers are desperately trying to get more compute resources but are limited by chip production - yes, of course? Why would they be scrambling to expand if their existing resources weren't already maxed out?

[–] Jakeroxs@sh.itjust.works 1 points 1 month ago (1 children)

Because they want the majority of the new chips for training models, not for running the existing ones, would be my assertion. Two different use cases.

[–] FooBarrington@lemmy.world 1 points 1 month ago (1 children)

Sure, and that's why many cloud providers - even ones that don't train their own models - are only slowly onboarding new customers onto bigger models. Sure. Makes total sense.

[–] Jakeroxs@sh.itjust.works 1 points 1 month ago (1 children)

I mean do you actually know or are you just assuming?
