this post was submitted on 26 Jun 2026
143 points (91.8% liked)

Technology

[–] duckCityComplex@lemmy.world 6 points 23 hours ago (1 children)

The article is not clear on what a "distillation attack" is... what exactly is Alibaba supposed to be getting away with here? The article mentions using many different connections through obfuscation networks and proxies... so that would get them around rate limiting, and maybe enable them to submit many queries on free accounts... just spin up a new account whenever you hit the token limit of an unpaid account. So basically it's a terms of service violation?

I don't see why it's necessarily a huge leg up for a competitor... they are just using the outputs of another model as training data. They still need to train their model, which is the expensive and energy intensive part.

It sounds to me like Anthropic just wants the US Government to help enforce its TOS internationally and force Alibaba to pay for those precious tokens? Because apart from that piece, the "attack" just seems like normal use of the service. If Anthropic's service has an inherent vulnerability, that's their problem.

Of course all the other comments about how they stole all their training data in the first place are spot on.

[–] iocase@lemmy.zip 4 points 12 hours ago* (last edited 12 hours ago) (1 children)

Distillation lets you train a smaller model to closely approximate the outputs of a larger model. Basically they're pirating all of the hard work Anthropic did pirating the entire internet.

Once training is finished, Alibaba gets a model that produces basically the same output at a tiny fraction of the operating cost. Distillation also draws basically all of its training data from the big model (afaik all of it is sourced from the parent model).

It's like if you took a lump of metal and showed it Porsche 911s until it turned into a 911 shaped chunk of metal that had 95% of the performance, but it only cost you $3000 for the ingot, and also cost ⅕ the amount in fuel and maintenance.
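
For a concrete picture, here is a minimal sketch of the data-collection half of distillation when you can only see a model's text outputs. Everything in it is an assumption for illustration: `build_distillation_set` and `toy_teacher` are hypothetical names, and `toy_teacher` is a stand-in for whatever large model is being queried, not a real API. The point is only that the "training data" is a list of prompts paired with the big model's answers.

```python
from typing import Callable, Dict, Iterable, List


def build_distillation_set(
    prompts: Iterable[str],
    teacher: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Label each prompt with the teacher's answer.

    The resulting pairs are ordinary supervised fine-tuning data for a
    smaller student model; the teacher's weights are never needed.
    """
    return [{"prompt": p, "completion": teacher(p)} for p in prompts]


def toy_teacher(prompt: str) -> str:
    # Hypothetical stand-in for a query to a large model.
    return prompt.upper()


dataset = build_distillation_set(["hello", "distill me"], toy_teacher)
print(dataset)
```

The student still has to be trained on those pairs afterwards, which is the part this sketch leaves out.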

[–] duckCityComplex@lemmy.world 2 points 10 hours ago

Ok, thanks for the detailed explanation. I guess if your goal is to make your model sound like another model, that makes perfect sense.