this post was submitted on 24 Feb 2026
51 points (91.8% liked)
Opensource
5640 readers
401 users here now
A community for discussion about open source software! Ask questions, share knowledge, share news, or post interesting stuff related to it!
⠀
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
But you're otherwise disgusted by the fact that material is plagiarized without consent to begin with...
...Right, FaceDeer?
You went digging through my Reddit comments to find a two-month-old thread, that must have taken a lot of effort. But I'm afraid I don't see what the relevance of it is, aside from a general "it's about AI". The bulk of the comments I wrote there were about water usage.
I'm genuinely puzzled. Are you saying that deduplicating data is "hiding unethical behaviour?" It's actually intended for improving the model's performance, having a model spit out exact copies of its training data means you've produced a hugely expensive and wasteful re-implementation of copy-and-paste rather than a generative AI. The whole point of generative AI is to produce novel outputs.