this post was submitted on 07 Dec 2025
AntiTrumpAlliance
LLM slop factories are overtly racist because they're trained on shit lifted straight off the internet.
That's image generation, not LLM (language/text generation), but the point stands
Hate to break it to you, but today's image generation comes through LLMs
(Multimodal) GPT ≠ "pure" LLM. GPT-4o uses an LLM for the language parts and has voice processing and generation built in, but it uses a technically distinct (though well-integrated) model called "GPT Image 1" for generating images.
You can't really train or treat image generation with the same approach as natural language, given that it isn't natural language. The binary data of an image doesn't adhere to the same patterns as human speech.
Just curious, does the LLM generate a text prompt for the image model, or is there a deeper integration at the embedding level/something else?
According to CometAPI:
I haven't found any other sources to back that up, because most platforms seem more concerned with how to access it than how it works under the hood.
You're right that image generation models are not LLMs, but they are actually pretty closely related. You may already know how they work, but for those who don't, it's kind of interesting. An image model uses a similar pipeline to vectorize its input, but takes a different approach to producing output.
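For anyone who wants to see the shape of that difference, here's a toy sketch in plain Python. None of this is how GPT-4o or any real model is implemented: `embed`, `next_token_distribution`, and `denoise_image` are made-up names, the "encoder" is just a hash, and the image side assumes a diffusion-style loop, which the comment above doesn't actually specify. It only shows that both sides can start from the same kind of vector, while one ends in a probability distribution over discrete tokens and the other in a continuous array refined over several steps.

```python
import hashlib
import math
import random

DIM = 8  # toy embedding width (hypothetical, real models use far more)


def embed(text: str) -> list[float]:
    """Toy stand-in for a text encoder: hash each word into a vector and average."""
    words = text.lower().split()
    vec = [0.0] * DIM
    for word in words:
        digest = hashlib.sha256(word.encode()).digest()
        for i in range(DIM):
            vec[i] += digest[i] / 255.0
    return [v / max(len(words), 1) for v in vec]


def next_token_distribution(context_vec: list[float], vocab: list[str]) -> dict[str, float]:
    """Language-style output: a softmax over a discrete vocabulary."""
    scores = [sum(a * b for a, b in zip(context_vec, embed(tok))) for tok in vocab]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return {tok: e / total for tok, e in zip(vocab, exps)}


def denoise_image(context_vec: list[float], steps: int = 10, seed: int = 0) -> list[float]:
    """Image-style output (assumed diffusion-like): start from noise and
    repeatedly nudge a continuous array toward what the prompt vector suggests."""
    rng = random.Random(seed)
    pixels = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
    for _ in range(steps):
        # Each step closes half of the remaining gap to the conditioning vector.
        pixels = [p + 0.5 * (c - p) for p, c in zip(pixels, context_vec)]
    return pixels


prompt_vec = embed("a red bicycle")                      # shared first step
token_probs = next_token_distribution(prompt_vec, ["red", "blue", "bicycle"])
toy_pixels = denoise_image(prompt_vec)                   # continuous, iteratively refined
```

The same `prompt_vec` feeds both functions; only the output stage differs, which is the part of the comparison the sketch is meant to illustrate.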