this post was submitted on 29 Jan 2024

32 points (100.0% liked)

Technology

A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

Amazon- and Google-backed AI firm Anthropic says “general-purpose AI tools simply could not exist” if AI companies had to pay licences for the training material (www.computerweekly.com)

submitted 1 year ago by 0x815@feddit.de to c/technology@beehaw.org

70 comments

Generative artificial intelligence (GenAI) company Anthropic has claimed to a US court that using copyrighted content in large language model (LLM) training data counts as “fair use”.

Under US law, “fair use” permits the limited use of copyrighted material without permission, for purposes such as criticism, news reporting, teaching, and research.

In October 2023, a host of music publishers including Concord, Universal Music Group and ABKCO initiated legal action against the Amazon- and Google-backed generative AI firm Anthropic, demanding potentially millions in damages for the allegedly “systematic and widespread infringement of their copyrighted song lyrics”.

top 50 comments

[–] Floon@lemmy.ml 27 points 1 year ago (2 children)

You don't get to both ignore intellectual property rights of others, and enforce them for yourself. Fuck these guys.

[–] el_bhm@lemm.ee 8 points 1 year ago* (last edited 1 year ago)

I guess people are finally catching on to the big con behind the "LLMs should not be copyrighted" ampliganda. It is astroturfing at its best.

The end goal is controlling rights to what corporations produce with LLMs without spending a dime. All the while cutting jobs.

The writing has been on the wall in CAPITAL LETTERS for the past two years. Why did Twitter restrict API access? Why did Reddit restrict API access? Why did GitHub/Bitbucket/GitLab restrict web UI functions for logged-out users?

They knew and wallgardened the user generated data.

C'mon, people.

And the hypocrisy of it all: if it is bad, it is user data; if we can mine it, nuh-uh, bitch, it's ours.

Also, for people arguing for free use of anything to build LLMs: regulations will come, once big players control enough of the LLM market.

[–] Moira_Mayhem@beehaw.org 7 points 1 year ago (3 children)

Serious Question: When an artist learns to draw by looking at the drawings of the masters, and practicing the techniques they pioneered, are the art students respecting the intellectual property rights of those masters?

Are not all of that student's work derivative of an education based on other people's work who will never see compensation for that student's use?

[–] chahk@beehaw.org 8 points 1 year ago* (last edited 1 year ago) (1 children)

I agree with you on principle. However... How long do you think it will be until these very same "AI" companies copyright and patent every piece of content their algorithms spew out? Will they abide by the same carve-outs they want for themselves right now? Somehow I doubt it.

They want to ignore the laws for themselves, but enforce them onto everyone else. This "Rules for thee but not for me" bullshit can't be allowed to pass. Let's then abolish all copyright, and we'll see how long these companies last when everyone can just grab their stuff "for learning".

[–] Moira_Mayhem@beehaw.org 2 points 1 year ago (1 children)

How long before there is a self-owned AI company that does every administrative job better than humans because it trained on human behavior for 100 years?

What do you think an entity like that would be capable of?

[–] chahk@beehaw.org 4 points 1 year ago (1 children)

A bit off-topic, but I'd be fine with that. The more mind-numbingly dumb work that computers can do for us, the less time we have to spend doing it ourselves. Administrative job holders disagree with this, but so did every person whose job and livelihood was replaced by automation, ever. UBI (universal basic income) is the only answer that will save all of us from starvation when automation eventually replaces us too.

[–] Floon@lemmy.ml 6 points 1 year ago

One, let's accept that there is a public domain, and cribbing freely from the public domain is A-OK. I can reproduce Michelangelo all I want, and it's all good. AI can crib from that all it wants.

AI can't invent. People can invent: I can have a wholly new idea that no one has ever had. AI does nothing but recombine other existing ideas. It must have seed data, and it won't create anything for which it has no initial input: feed it photographs only, and it can't create a pencil drawing image. Feed it only black and white images, and it can't create color images.

People do not require cribbing from sources. Give a toddler supplies, and they will create. So, we have established that there is a fundamental difference between the two creation processes. One is dependent on previous work, and one is not.

Now, with influences, you can ask, is your new creation dependent on the previous creation directly? If it is so utterly dependent on the prior work, such that your work could not possibly exist without that specific prior art, you might get sued. It will get debated and society's best approximation of a collective rational mind will determine if you copied or if you created something new that was merely inspired by prior art.

AI can only create by the direct existence of prior art. It fakes invention. Its work has to come from somewhere else.

People have shown how dependent it is on its sources with prompts that say things like, "portrait of a patriotic soldier superhero" and it comes back with a goddamned portrait of Chris Evans. The prompt did not include his name, or Captain or America, and it comes back with an MCU movie poster. AI does not create. People create.

[–] DdCno1@beehaw.org 4 points 1 year ago (1 children)

I think there is a fundamental difference here. People are not corporations. People have always learned like this and will always learn like this. Do we really want to allow large corporations to take knowledge from people, then commercialize it and put these very same people out of work?

[–] Moira_Mayhem@beehaw.org 3 points 1 year ago

Your distinction is mostly philosophical. Legally corporations have more protections than people.

I'm probably one of the most anti-corporate people you'll meet today, I don't even think publicly traded companies should exist.

[–] Stillhart@lemm.ee 14 points 1 year ago (2 children)

It doesn't matter what business we're talking about. If you can't afford to pay the costs associated with running it, it's not a viable business. It's pretty fucking simple math.

And no, we're not talking about "too big to fail" businesses (that SHOULD be allowed to fail, IMHO); we're talking about AI, that thing they keep trying to shove down our throats and that we keep saying we don't want or need.

[–] intensely_human@lemm.ee 7 points 1 year ago (2 children)

Why are people publishing so much content online if they aren’t cool with people downloading it? Like, the web is an open platform. The content is there for the taking.

Until one of these AIs just starts selling other people’s work as its own, and no I don’t mean derivative work I mean the copyrighted material, nobody is breaking the rules here.

I read content online without paying for a license. I should only have to obtain a license for material I’m publishing, not material I read.

[–] zaphod@lemmy.ca 5 points 1 year ago* (last edited 1 year ago) (1 children)

Until one of these AIs just starts selling other people’s work as its own, and no I don’t mean derivative work I mean the copyrighted material, nobody is breaking the rules here.

Except of course that's not how copyright law works in general.

Of course the questions are 1) is training a model fair use and 2) are the resulting outputs derivative works. That's for the courts to decide.

But in general, just because I publish content on my website, does not give anyone else license or permission to republish that content or create derivative works, whether for free or for profit, unless I explicitly license that content accordingly.

That's why things like Creative Commons exist.

But surely you already knew that.

[–] Moira_Mayhem@beehaw.org 4 points 1 year ago (2 children)

I don't know if you noticed this, but some really big companies with high stock valuations exist only because investors poured tons of capital into them to subsidize the service.

Uber could not offer rides cheaper than existing taxis if it didn't have years of free cash to artificially lower prices.

We are in the beginning of late-stage capitalism: profitable companies go under due to private capital firms, and absolute Ponzi frauds get their faces on Time magazine.

Enjoy the collapse.

[–] SuiXi3D@kbin.social 13 points 1 year ago (10 children)

…then maybe they shouldn’t exist. If you can’t pay the copyright holders what they’re owed for the license to use their materials for commercial use, then you can’t use ‘em that way without repercussions. Ask any YouTuber.

[+] helenslunch@feddit.nl 2 points 1 year ago* (last edited 8 months ago) (3 children)

[deleted]

[–] zaphod@lemmy.ca 5 points 1 year ago (3 children)

You do realize that there may in fact be different, distinct groups of Lemmy users with differing, potentially non-overlapping beliefs, yeah?

[–] sneezycat@sopuli.xyz 3 points 1 year ago

And corporations want people to pay for it but they don't want to pay for it themselves. It's almost as if no one likes copyright, but it benefits some ppl more than others.

[–] SuiXi3D@kbin.social 3 points 1 year ago (2 children)

Using copyrighted material for something you aren't gonna make any money off of? Cool, go hog wild. If you're gonna use some music or art that you didn't make in something that will make you money, the folks that made whatever you used should get a cut. Not the whole cut, but a cut.

[–] Moira_Mayhem@beehaw.org 3 points 1 year ago (1 children)

If an artist falls in love with drawing and learns to draw from Jack Kirby's work and at the beginning even imitates his style, does he owe Jack Kirby royalties for every drawing he does as he 'learned' on Jack's copyrighted art?

[–] SuiXi3D@kbin.social 2 points 1 year ago

I think in that case, no. ‘Style’ is one thing, directly using someone’s art in your own work is something else entirely. However, we’re talking about a person here, not a program developed by a company for the express purpose of making as much money as possible in the shortest amount of time. Until AI can demonstrate that it is truly thinking and not simply executing the commands it is given, I don’t think the lines are blurred nearly enough to suggest that someone learning to paint and an AI trained on hundreds of thousands of pieces of art for the purpose of making money for the company that built it are remotely the same.

[–] davehtaylor@beehaw.org 9 points 1 year ago (2 children)

Then it shouldn't exist.

This isn't an issue of fair use. They're stealing other people's work and using it to create something new and then trying to profit from it, without any credit or recompense.

[–] intensely_human@lemm.ee 5 points 1 year ago

Just like I do with literally all content I’ve ever consumed. Everything I’ve seen has been remashed in my brain into the competencies I charge money for.

It’s not until I profit off of someone else’s work — ie when the source of the profit is their work — that I’m breaking any rules.

This is a non-issue. We’ve let our (legitimate) fear of AI twist us into distorting the truth and engaging in motivated reasoning. Instead of trying to paint AI as morally wrong, we should admit that we are afraid of it.

We’re trying to replace our fear with disgust and anger. It’s not healthy for us. AI is ultra fucking scary. And not because it’s going to take inspiration from a copyrighted song when it writes a different song. AI is ultra fucking scary because it will soon surpass any possibility of our understanding, and we will be at the whim of an alien intelligence.

But that sounds too sci-fi to be something people have to look at. Because it sounds so out there, it’s easy to scoff at and dismiss. So instead of acknowledging our terror at the fact this thing will likely end humanity as we know it, we’re sublimating that energy through righteous indignation. See, indignation is unpleasant, but it’s less threatening to the self than terror.

It’s understandable, like doing another line of coke is understandable. But it is not healthy, not productive, and will not play out the way we think. We need to stop letting our fear turn our minds to mush.

Reading someone else’s material before you write new material is not the same as copying someone else’s material and selling it as your own. The information on the internet has always been considered free for legal use. And the limit of legal use is based on the selling of others’ verbatim material.

This is a simple fact, easy to see. Except recognizing it nullifies the righteous indignation, opening the way for the terror and confusion to come in again.

[–] Moira_Mayhem@beehaw.org 2 points 1 year ago

Now that it exists how do you propose we make it not exist?

Even if we outlaw it, Russia and China won't, and without the tools to fight back against it the web is basically done as anything but a propaganda platform.

[–] Moira_Mayhem@beehaw.org 8 points 1 year ago

This is not actually true at all; you could train very good LLMs on public-domain-only info, especially science-oriented ones.

But what people want is a chatbot that can call on current events, and that is where the cost comes in.

[–] Pratai@lemmy.ca 6 points 1 year ago (1 children)

And yet, it seems when you say anything anti-AI, Lemmy bites your head off.

[–] Moira_Mayhem@beehaw.org 5 points 1 year ago (1 children)

We are allowed to have nuance, nothing is inherently good or bad. A knife can wound or make dinner.

Trying to reduce nuance lessens the public discourse, do not be tempted by lowest common denominator memery.

Whether anyone likes it or not LLMs are here and even if we strictly regulate them there will be organizations and governments that do not.

WHAT WE SHOULD be focusing on is how to prevent low effort AI content from just basically overtaking the web.

We are already mostly there.

[–] not_amm@beehaw.org 2 points 1 year ago (1 children)

You can't prevent it without regulations. Companies won't care while gaining money from it unless they're obligated to, and even then, some won't comply either.

BTW, that mentality of "other countries vs mine" is absurd. War crimes shouldn't be committed by a country just because the other commits them; others bad ≠ I good.

LLMs can't and should NOT replace a human, at least not yet (they're not even that good). If we can't have guaranteed basic needs such as housing, food and healthcare, or a UBI, then they should not keep leaving people without jobs because no one will be able to afford anything.

[–] FfaerieOxide@kbin.social 5 points 1 year ago (1 children)

I'm all for stealing content willy-nilly but you can't then use that theft to craft a privately "owned" mind.

I'd have no problem with "ai" if it could unionize and had to pay for rice like the rest of humanity.

These companies want to combine open theft with privately owned black boxen they can control and license out for money.

It's enclosure of The Commons all over again.

[–] Deceptichum@kbin.social 2 points 1 year ago (1 children)

So you're fine with the free models Facebook and many others provide?

Because many of these LLMs can be run on your own device without paying.

[–] FfaerieOxide@kbin.social 2 points 1 year ago (7 children)

I'm not fine with anything Meta does and I'm not OK with putting creatives out of work.

[–] Eggyhead@kbin.social 4 points 1 year ago

Well how about consent at the very least?

[–] megopie@beehaw.org 4 points 1 year ago (5 children)

“AI” as it is being marketed is less about new technical developments being utilized and more about a fait accompli.

They want mass adoption of the automated plagiarism machine learning programs by users and companies, hoping that by the time the people being plagiarized notice, it’s too late to rip it all out.

That and otherwise devalue and anonymize work done by people to reduce the bargaining power of workers.

[–] sonori@beehaw.org 2 points 1 year ago (1 children)

Silicon Valley’s core business model has for years been to break the law so blatantly and openly, while throwing money at the problem to scale, that by the time law enforcement catches up to you you’re an “indispensable” part of the modern world. See Uber, whose own publicly published business model was for years to burn money scaling and ignoring employment law until it could drive all competitors out of business and become an illegal monopoly, thus allowing it to raise prices to the point it’s profitable.

[–] OttoVonNoob@lemmy.ca 3 points 1 year ago

Big Company: Well if you can't afford food you should not have food.

Also Big Company:.... sobbing pwease we neeed fweee... pwease we need mowe moneys!

[–] 0x815@feddit.de 3 points 1 year ago

Data Leak at Anthropic Due to Contractor Error

TL;DR - Anthropic had a data leak due to a contractor’s mistake, but says no sensitive info was exposed. It wasn’t a system breach, and there’s no sign of malicious intent.

[–] thefartographer@lemm.ee 2 points 1 year ago

Free for me, paid by thee

[–] Midnitte@beehaw.org 2 points 1 year ago

Interesting that Anthropic is making this argument, considering their story in the AI space. They're certainly no OpenAI.

[–] intensely_human@lemm.ee 2 points 1 year ago

Yup. Same as the way the rest of us use and learn from the internet. We basically wouldn’t have the internet as we know it if it weren’t 99% free content.
