this post was submitted on 22 Aug 2023
771 points (95.6% liked)
Technology
But they don’t purchase the data. That’s the whole problem.
And copyright is absolutely violated by training on it. It’s being used to make money and no longer falls under even the widest interpretation of fair use.
You need to expand on how learning from something to make money is somehow using the original material to make money. Considering that's how art works in general, I'm having a hard time taking the side of "learning from media to make your own is against copyright". As long as they don't reproduce the same thing as the original, I don't see any issues with it. If they learned from Lord of the Rings to then make "The Lord of the Rings", then yes, that'd be infringement. But if they use that data to make a new IP with original ideas, then how is that bad for the world or for artists?
An AI model is a commercial work. They’re made to make money. Now these models are dependent on other artists’ data to train on. The models would be useless if they weren’t able to train on anything.
I hold the stance that using copyrighted data as part of a training set is a violation of copyright. That still hasn’t been fully challenged in court, so there’s no specific legal definition yet.
Because the model needs copyrighted material in order to function, I feel that they are using copyrighted works to build a commercial product.
Also, AI doesn’t learn. LLMs build statistical models based on the sentence structure of what they’ve seen before. There’s no level of understanding or inherent knowledge, and there’s nothing new being added.