
CEO Steve Huffman says tech giants should not be able to trawl Reddit’s huge store of data for free. But that information came from users, not the company

That “corpus of data” is the content posted by millions of Reddit users over the decades. It is a fascinating and valuable record of what they were thinking and obsessing about. Not the tiniest fraction of it was created by Huffman, his fellow executives or shareholders. It can only be seen as belonging to them because of whatever skewed “consent” agreement its credulous users felt obliged to click on before they could use the service.

Ouch

[-] constantokra@lemmy.one 56 points 1 year ago

Wide open for AI scraping and nothing are not the only two options. They could easily limit API calls to what would be good for single users or mods and have each user generate their own key. Apps could let users input their key. Most users wouldn't bother and would switch to Reddit's app anyway, so it would get them 95% of what they claim to want without being a dick about it.

[-] CrateDane@feddit.dk 43 points 1 year ago

Plus AI companies can just scrape reddit without using the API. It's still a website after all.

[-] JustZ@lemmy.world 13 points 1 year ago

They want the timing of how long a user looks at something. They can't scrape that from third party apps.

[-] AustralianSimon@lemmy.world 2 points 1 year ago

Yes you can. PC emulation of apps is common.

[-] asexualchangeling@lemmy.ml 4 points 1 year ago

For how much longer though? I wouldn't put it past them to try to make it only available through an app

[-] dxxth@lemmy.world 4 points 1 year ago

If the data is that important to them that they kill the site, then they're more dumb than I think. Apps can be scraped too. It isn't even difficult.

[-] PuffyPanda@lemmy.world 2 points 1 year ago

I highly doubt Reddit is gonna shut down their website.

[-] Nahlej@lemmy.world 4 points 1 year ago

I saw a post saying they were testing restricting mobile access to only through the app.

[-] PuffyPanda@lemmy.world 1 points 1 year ago

Oh yeah, they’ve done that already. I don’t think they’ll extend that to actual web tho

[-] FanciestPants@lemmy.world 5 points 1 year ago

I'm not sure if I wasted my time, but I spent a few hours today editing all of my posts on Reddit to be a single comma or period. I didn't comment or post a lot by any means, but just got irritated enough to try to keep from contributing in any way to Spez profiting off of user provided content.

[-] spark947@lemm.ee 2 points 1 year ago

Can't shreddit do this in bulk? I am considering doing it for my comments, but I think I will just leave them up there. I did have a great time on reddit until they announced their API changes, so I will leave them with that much. But I did get a backup of everything I wrote using bulk downloader.

But I am still considering just doing a shreddit just for kicks.

[-] lemba@discuss.tchncs.de 1 points 1 year ago

Yeah, I did the same thing a few days ago. I used the browser add-on called Reddit Enhancement Suite to delete all my posts and comments. Instructions: https://www.alphr.com/how-to-delete-all-reddit-posts/

[-] penguinv@lemmy.world 2 points 1 year ago

so sad. Not opposing but like burning a forest.

[-] spark947@lemm.ee 1 points 1 year ago

Honestly, I think the sad truth is that Reddit is bleeding money, and every action they take from here on out will be about recruiting whales and driving off everyone else. That's Steve's brilliant business strategy: make Reddit p2w.

[-] Pika@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

That's how they did it. They put a limit of 10 requests a minute on bots and a higher OAuth limit (100) for individuals. Large user-client-type apps could have converted over to that system fairly easily, but due to time constraints they didn't. I do think they extorted their third-party devs, sure, but honestly the individual user limit isn't super unreasonable as long as you aren't liking or disliking every post. The search API returns 100 posts per API request. It was more the no-NSFW and no-advertising limits they put on it that sucked.

edit: it's actually 10 or 100 per minute, not per hour
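
To make the two tiers above concrete, here is a minimal sketch of a per-key limiter. It is an illustration only: the thread gives the numbers (10 and 100 queries per minute) but not how Reddit actually counts them, so the sliding-window approach and the names `LIMITS_QPM` and `SlidingWindowLimiter` are assumptions.

```python
import time
from collections import defaultdict, deque

# Limits quoted in the thread: 100 queries per minute with OAuth, 10 without.
LIMITS_QPM = {"oauth": 100, "anonymous": 10}


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each key.

    Hypothetical helper: how Reddit measures its window is not stated
    in the thread, so a sliding window is assumed here.
    """

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._calls = defaultdict(deque)  # key -> timestamps of recent calls

    def allow(self, key):
        now = self.clock()
        calls = self._calls[key]
        # Drop timestamps that have fallen out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) < self.limit:
            calls.append(now)
            return True
        return False
```

A client could call `allow(key)` before each request and back off when it returns `False`. Each key gets its own budget, so one heavy user does not use up another's allowance.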

[-] spark947@lemm.ee 1 points 1 year ago

It's not that simple, because the third party apps ship with a single api key. So I used Relay for reddit, and used the same api key as everyone else on that app. You could create an app, and then have everyone make their own key, but that is just asking for trouble. Definitely too technical for most people, and you would probably need to put in billing info for a scenario where you go above the free-tier call limit.

[-] Pika@lemmy.world 1 points 1 year ago* (last edited 1 year ago)

Yeah, but if you're going to use the OAuth 2 method you don't use the same API key as everyone. How that works is you authorize your account with the bot, the company gives you a bearer token, and then that token is what's used for rate limits. The bot client token is not used in that process; the OAuth 2 bearer token is.

This is taken from the Reddit API docs: As of July 1, 2023, we will enforce two different rate limits for those eligible for free access usage of our Data API. The limits are:

If you are using OAuth for authentication: 100 queries per minute (QPM) per OAuth client ID
If you are not using OAuth for authentication: 10 QPM

So apparently I undershot it; it's actually 100 requests per minute, not per hour like I originally thought.
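
For anyone who hasn't seen it, sending a bearer token with a request looks roughly like the sketch below. The `Authorization: Bearer <token>` header is standard OAuth 2 usage; the host `api.example.com`, the path, and the function name `build_request` are placeholders, not Reddit's actual endpoints. The request is only constructed here, never sent.

```python
from urllib.request import Request

# Placeholder host for illustration; not a real Reddit endpoint.
API_BASE = "https://api.example.com"


def build_request(path, bearer_token, user_agent, base_url=API_BASE):
    """Build (but do not send) an API request authenticated per user.

    The bearer token identifies the authorized user session, so the
    server can attribute the call to that token rather than to one key
    shared by every install of the app.
    """
    headers = {
        "Authorization": f"Bearer {bearer_token}",
        "User-Agent": user_agent,
    }
    return Request(base_url + path, headers=headers)
```

Passing the result to `urllib.request.urlopen` would perform the call. The token value itself comes from the authorization step described in the comment above.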

[-] spark947@lemm.ee 0 points 1 year ago

Well, I don't know how the Reddit API works, but what you described is generally bad practice, as I understand it. The OAuth tokens allow the app to perform actions on behalf of authenticated users, but they still need to use the Reddit API, and I imagine an API key, to perform those actions. You generally aren't supposed to use OAuth as an access authentication mechanism.

At least pricing is per OAuth key, but the pricing burden is still going to fall on the developers of these apps, who Reddit now views as its "competitors", despite them making a product that supported Reddit's business for years.

[-] Pika@lemmy.world 1 points 1 year ago

OAuth 2 is an authorization standard; that's basically what it is meant for. It's intended to be used as an identification system that lets a client tell a first party "hey, I'm me" through a third party, without ever giving the third party your password.

Discord, Facebook/Meta, Google (most services), SoundCloud: all of those use OAuth 2 based APIs. OAuth 2 is used basically everywhere for the same purpose Reddit is using it for.

Like you said, it can be dangerous if you authorize a third-party app. Honestly, I'm willing to bet that rif and Apollo both used the OAuth 2 API at least in some part; otherwise I don't think they would have been able to let you upvote or downvote posts or post comments as you. A good way to tell is whether you had to log in and it brought you to a page that said "authorize this app with Reddit". If it showed that, you were using OAuth 2.
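
That "authorize this app" page is the first step of the OAuth 2 authorization code flow. A rough sketch of the app's side of it follows. The query parameters (`response_type`, `client_id`, `redirect_uri`, `scope`, `state`) are the standard ones from the OAuth 2 spec; the endpoint URLs, scope names, and function names are made up for illustration and are not taken from any real app.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorize_url(authorize_endpoint, client_id, redirect_uri,
                        scopes, state=None):
    """Return (url, state) for sending the user to the consent page."""
    # A random state value lets the app match the redirect to this request.
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
    })
    return f"{authorize_endpoint}?{query}", state


def parse_callback(callback_url, expected_state):
    """Extract the authorization code from the redirect back to the app."""
    params = parse_qs(urlparse(callback_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch; refusing the callback")
    return params["code"][0]
```

The app then exchanges the returned code for a bearer token at the provider's token endpoint. The user's password is only ever typed into the provider's own page, which is the point made above.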

this post was submitted on 17 Jun 2023
1092 points (98.8% liked)

Lemmy.World Announcements
