this post was submitted on 28 Jan 2025
887 points (94.5% liked)
memes
15826 readers
3489 users here now
Community rules
1. Be civil
No trolling, bigotry or other insulting / annoying behaviour
2. No politics
This is non-politics community. For political memes please go to !politicalmemes@lemmy.world
3. No recent reposts
Check for reposts when posting a meme, you can only repost after 1 month
4. No bots
No bots without the express approval of the mods or the admins
5. No Spam/Ads
No advertisements or spam. This is an instance rule and the only way to live.
A collection of some classic Lemmy memes for your enjoyment
Sister communities
- !tenforward@lemmy.world : Star Trek memes, chat and shitposts
- !lemmyshitpost@lemmy.world : Lemmy Shitposts, anything and everything goes.
- !linuxmemes@lemmy.world : Linux themed memes
- !comicstrips@lemmy.world : for those who love comic stories.
founded 2 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
i mean, if it's not directly factually inaccurate, than, it is open source. It's just that the specific block of data they used and operate on isn't published or released, which is pretty common even among open source projects.
AI just happens to be in a fairly unique spot where that thing is actually like, pretty important. Though nothing stops other groups from creating an openly accessible one through something like distributed computing. Which seems to be a fancy new kid on the block moment for AI right now.
Is it common? Many fields have standard, open datasets. That's not the case here, and this data is the most important part of training an LLM.