this post was submitted on 26 Jun 2026

[–] iocase@lemmy.zip 2 points 3 hours ago (1 children)

LLMs are trained by taking a passage of text and masking out the next words. The LLM has to guess what the next word is going to be.
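To make that concrete, here's a rough toy sketch in Python. It's not how any real training pipeline is written, just the idea. The `next_word_examples` helper is made up for illustration: it turns a passage into (context, next word) pairs, which is the guessing game the model gets trained on. Real models work on tokens rather than whole words.

```python
def next_word_examples(text):
    """Split a passage into (context, target) pairs for next-word prediction."""
    words = text.split()
    # For each position, everything before it is the context
    # and the word at that position is what the model has to guess.
    return [(words[:i], words[i]) for i in range(1, len(words))]


pairs = next_word_examples("the cat sat on the mat")
for context, target in pairs:
    print(" ".join(context), "->", target)
```

Each printed line is one training example: the visible text on the left, the hidden word on the right.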

If you use the output of a fancy ass billion dollar model as your training data, you can duplicate the output style and "knowledge" of the parent model if you show your own model enough responses. That's basically what Alibaba did. They prompted the shit out of Claude and used the responses to train their own model, which let them piggyback off of Claude's hard work pirating the entire internet. The cloned model can also be smaller and leaner, so it's cheaper to operate.
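A toy sketch of that idea in Python, with loud assumptions: `teacher` is a made-up stand-in for the big model (it only returns canned answers), and the "student" is just a word-pair counter, not a neural network. Nothing here reflects how any company actually does it; it only shows the shape of "collect the parent's responses, then train on them".

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for the expensive parent model: canned answers only.
TEACHER_ANSWERS = {
    "capital of france?": "paris is the capital of france",
    "capital of italy?": "rome is the capital of italy",
}


def teacher(prompt):
    """Pretend to query the parent model."""
    return TEACHER_ANSWERS[prompt]


def train_student(responses):
    """Count which word follows which in the teacher's responses."""
    counts = defaultdict(Counter)
    for response in responses:
        words = response.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def student_next_word(counts, word):
    """Return the follower the student saw most often, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]


# Step 1: prompt the teacher and keep its responses as training data.
responses = [teacher(prompt) for prompt in TEACHER_ANSWERS]
# Step 2: train the student only on those responses.
student = train_student(responses)
print(student_next_word(student, "capital"))
```

The student never sees the teacher's original training data, only its answers, and it still picks up the teacher's phrasing.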

I said this elsewhere, but it's like taking a block of metal and showing it Porsche 911s until it turns into a Porsche 911 with 95% of the performance that costs ⅕ as much to maintain and fuel.

[–] GreenKnight23@lemmy.world 1 points 48 minutes ago

Here's the thing with the Porsche analogy: you had to buy or rent the Porsche first. You paid for it and used it exactly within the TOS outlined in the contract. No law was broken.

What Anthropic is arguing is that anything their model comes up with remains Anthropic IP. This means they will literally need to sue every single one of their customers first, before they even have a snowball's chance in hell of pursuing Alibaba.

They already set the precedent by not legally pursuing their customers who use paid-for content generated by their model, which automatically becomes the property of the end user.