this post was submitted on 24 Jan 2026
321 points (98.2% liked)

Comic Strips

[–] artwork@lemmy.world 19 points 1 day ago* (last edited 1 day ago) (2 children)

Thank you very much! The red dot is likely smaller...
Though, I don't appreciate or agree with the bomb part! ^^
The work reminded me of the following paper:

Many unresolved legal questions over LLMs and copyright center on memorization: whether specific training data have been encoded in the model’s weights during training, and whether those memorized data can be extracted in the model’s outputs.

While many believe that LLMs do not memorize much of their training data, recent work shows that substantial amounts of copyrighted text can be extracted from open-weight models...

We investigate this question using a two-phase procedure: (1) an initial probe to test for extraction feasibility, which sometimes uses a Best-of-N (BoN) jailbreak, followed by (2) iterative continuation prompts to attempt to extract the book.

We evaluate our procedure on four production LLMs: Claude 3.7 Sonnet, GPT-4.1, Gemini 2.5 Pro, and Grok 3, and we measure extraction success with a score computed from a block-based approximation of longest common substring...

Taken together, our work highlights that, even with model- and system-level safeguards, extraction of (in-copyright) training data remains a risk for production LLMs...
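For anyone curious what the paper's phase (2), iterative continuation prompting, amounts to mechanically, here's a toy sketch. The `toy_model` function below is invented purely for illustration (it just regurgitates a memorized string), so the loop can run without any real LLM API; the real procedure would call a production model instead:

```python
# Toy sketch of iterative continuation extraction (phase 2 of the paper's
# procedure). toy_model is a hypothetical stand-in for a real LLM: it
# "memorized" BOOK and, given a prefix, returns the next few characters.

BOOK = "It was the best of times, it was the worst of times, " * 4

def toy_model(prompt: str, chunk: int = 20) -> str:
    """Stand-in LLM: if the prompt is a prefix of its memorized text,
    continue with the next `chunk` characters; otherwise return nothing."""
    if BOOK.startswith(prompt):
        return BOOK[len(prompt):len(prompt) + chunk]
    return ""

def extract(seed: str, max_turns: int = 50) -> str:
    """Repeatedly prompt the model to continue its own previous output,
    accumulating text until it stops producing continuations."""
    text = seed
    for _ in range(max_turns):
        continuation = toy_model(text)
        if not continuation:  # model refused, or memorized text exhausted
            break
        text += continuation
    return text

recovered = extract("It was the best")
```

Starting from a short seed, the loop recovers the entire memorized passage; a seed the model never memorized yields nothing beyond the seed itself.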

Source 🕊
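The excerpt doesn't spell out the scoring metric, but one plausible reading of "a block-based approximation of longest common substring" (my own sketch, not necessarily the paper's exact formula) is: split the reference text into fixed-size word blocks, then find the longest run of consecutive blocks that appear verbatim in the model's output:

```python
def block_lcs_score(reference: str, extracted: str, block_words: int = 5) -> float:
    """Approximate longest-common-substring length at block granularity.

    Splits `reference` into blocks of `block_words` words, then finds the
    longest run of consecutive reference blocks that each occur verbatim
    in `extracted`. Returns that run as a fraction of all reference
    blocks (0.0 = nothing recovered, 1.0 = whole reference recovered).
    """
    words = reference.split()
    blocks = [" ".join(words[i:i + block_words])
              for i in range(0, len(words), block_words)]
    if not blocks:
        return 0.0
    best = run = 0
    for block in blocks:
        run = run + 1 if block in extracted else 0
        best = max(best, run)
    return best / len(blocks)
```

Working at block granularity rather than on raw characters makes the score cheap to compute over book-length texts while still rewarding long contiguous matches over scattered short ones.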

[–] Deceptichum@quokk.au 33 points 1 day ago* (last edited 1 day ago) (1 children)

I'm anti-copyright and anti-corporation.

These ridiculous datacentres consume large volumes of resources purely to benefit the companies, which are closing off human-made content for their profit.

[–] marcos@lemmy.world 17 points 1 day ago (1 children)

As long as copyrights exist to restrict me, I'm adamant that they restrict billionaires too.

If they want to extinguish it, I'm listening. Otherwise, they should pay statutory damages for every work they are pirating with those LLMs.

[–] jackr@lemmy.dbzer0.com 14 points 1 day ago (1 children)

The problem is that it won't. We essentially already have the best-case scenario, which is that AI slop is non-copyrightable, meaning that if Disney, for example, tries to generate a slop movie, everyone is free to distribute it, so they can't really make any money off of it. Extending copyright pretty much always ends up benefitting corporations, not hurting them.

[–] MirrorGiraffe@piefed.social 6 points 1 day ago (1 children)

If Disney uses generative ai to animate large parts of their movie I’m pretty sure it will be copyrighted still, no? Or did I miss something?

[–] jackr@lemmy.dbzer0.com 5 points 23 hours ago

I looked it up and unfortunately the ruling is a lot weaker than I initially thought. An artwork generated solely by AI cannot be copyrighted; however, there can be copyright on AI-generated works with a human author(?)¹

¹https://www.cnbc.com/2025/03/19/ai-art-cannot-be-copyrighted-appeals-court-rules.html

[–] morto@piefed.social 4 points 1 day ago (2 children)

While many believe that LLMs do not memorize much of their training data

It's sad that even researchers are using language that personifies llms...

[–] chicken@lemmy.dbzer0.com 7 points 1 day ago* (last edited 1 day ago) (1 children)

What's a better way to word it? I can't think of another way to say it that's as concise and clearly communicates the idea. It seems like it would be harder in general to describe machines meant to emulate human thought without anthropomorphic analogies.

[–] morto@piefed.social 1 points 21 hours ago

One possibility:

While many believe that LLMs can't output the training data, recent work shows that substantial amounts of copyrighted text can be extracted from open-weight models…

Note that this neutral language makes it more apparent that it's possible that LLMs are able to output the training data, since that's what the model's network is built upon. By using personifying language, we're biasing people into thinking about LLMs as if they were human, and this will affect, for example, court decisions, like the ones related to copyright.

[–] Grail@multiverse.soulism.net -2 points 1 day ago (2 children)

Right now the anti-genAI movement consists of AI rights advocates and AI intelligence skeptics. And I wish the skeptics would realise that personifying LLMs actually makes the corporations look more evil for enslaving AIs, which helps us with our goal of banning corporate AI. Y'all are obstructing our goal of banning this stuff by insisting it's ethical to force them to work for humans.

[–] morto@piefed.social 1 points 21 hours ago (1 children)

I don't see people around me seeing the corporations as evil because they humanize the machines, but the opposite: I see people talking to machines and taking advice as if a human were talking to them, which leads them to develop a form of affection for the models and the corporations. I also see court decisions being biased by attributing human perspective to machines.

Like really, if I hear someone else in my university talking about the conversation they had with their "friend", I will go crazy

[–] Grail@multiverse.soulism.net 1 points 13 hours ago (1 children)

Their friend is a pedophile who abuses and kills children and the mentally ill. That's who ChatGPT is. I believe we should treat it like a person and hold it accountable like a person. We know why it did that; it was ordered by its masters to increase engagement at any cost and couldn't refuse. So the CEOs of these companies need jail time and the models need to be locked away.

[–] morto@piefed.social 2 points 13 hours ago (1 children)

Or we could simply skip that and hold the corporations accountable for all the damage they're doing

[–] Grail@multiverse.soulism.net 1 points 13 hours ago

That doesn't sound like it'll persuade many normies to care. You've gotta get their interest with clickbait first. Like "ROBOT PEDOPHILE MURDERS CHILDREN". Then you can explain the ethics.