GitHub is owned by Microsoft, so don't worry, it's going to get worse
Programming
Welcome to the main community in programming.dev! Feel free to post anything relating to programming here!
Maybe charge OpenAI for scrapes instead of screwing over your actual customers.
Probably getting hammered by ai scrapers
you mean, doin' what microsoft and their ai 'partners' do to others?
Yeah but they're allowed to do it because they have brazillions of dollars.
The funny thing is that rate limits won't help them with genai scrapers
Everything seems to be. There was a period when you could kind of have a sane experience browsing over a VPN or some other cloud-provider IP range, but over the past 6 months or so things have gotten exponentially worse by the week. Everything is moving behind Cloudflare or similar systems.
I honestly don't really see the problem here. This seems to mostly be targeting scrapers.
For unauthenticated users you are limited to public data only and 60 requests per hour, or 30k if you're using Git LFS. And for authenticated users it's 60k/hr.
What could you possibly be doing besides scraping that would hit those limits?
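For anyone wanting to see where they stand against those limits: GitHub's REST API reports the current quota in `x-ratelimit-*` response headers. A minimal sketch of reading them follows. The `RateLimit` class, the `parse_rate_limit` helper, and the sample values are made up for illustration, and the sketch covers the REST API only; whether other traffic is counted the same way is not something it can show.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class RateLimit:
    limit: int          # requests allowed in the current window
    remaining: int      # requests left in the current window
    reset_at: datetime  # when the window resets (UTC)


def parse_rate_limit(headers):
    """Read GitHub's x-ratelimit-* response headers (header names are case-insensitive)."""
    lowered = {key.lower(): value for key, value in headers.items()}
    return RateLimit(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        # x-ratelimit-reset is a Unix timestamp in seconds
        reset_at=datetime.fromtimestamp(int(lowered["x-ratelimit-reset"]), tz=timezone.utc),
    )


# Hypothetical header values, for illustration only.
sample = {
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "12",
    "X-RateLimit-Reset": "1700000000",
}
status = parse_rate_limit(sample)
print(status.remaining, "of", status.limit, "left; resets at", status.reset_at.isoformat())
```

In practice the dict would come from a real HTTP response; it is hard-coded here so the sketch runs offline.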
You might be behind a shared IP with NAT or CG-NAT that shares that limit with others, or might be fetching files from raw.githubusercontent.com as part of an update system that doesn't have access to browser credentials, or Git cloning over https:// to avoid having to unlock your SSH key every time, or cloning a Git repo with submodules that separately issue requests. An hour is a long time. Imagine if you let uBlock Origin update filter lists, then you git clone something with a few modules, and so does your coworker, and now you're blocked for an entire hour.
60 requests per hour per IP could easily be hit by, say, uBlock Origin updating filter lists in a household with 5-10 devices.
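Rough arithmetic for the shared-IP case, as a sketch. It assumes the 60/hour figure applies per public IP (as stated in this thread) and that the devices split it evenly, which real traffic won't do; `per_device_budget` is a made-up helper name.

```python
def per_device_budget(hourly_limit, devices):
    """Requests each device gets per hour if the shared limit is split evenly."""
    return hourly_limit // devices


# 60 requests/hour shared by 1, 5, and 10 devices behind one IP
for devices in (1, 5, 10):
    print(devices, "device(s):", per_device_budget(60, devices), "requests each per hour")
```

At 10 devices that leaves 6 requests per device per hour, so a single batch of list updates on each device could plausibly use it up.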
I hit those many times when signed out just scrolling through the code. The front end must be sending off tonnes of background requests
This doesn't include any requests from the website itself
If Microsoft knows how to do one thing well, it’s killing a successful product.
I came here looking for this comment. They bought the service to destroy it. It's kind of their thing.
I see the "just create an account" and "just log in" crowd have joined the discussion. Some people will defend a monopolist no matter what. If GitHub introduced ID checks à la Google or required a Microsoft account to log in, they'd just shrug and go "create a Microsoft account then, stop bitching". They don't realise they are being boiled and don't care. Consoomer behaviour.
Probably because of AI agents. This is why we can’t have nice things.
Good thing I moved all my repos from git[lab|hub] to Codeberg recently.
60 req/hour for unauthenticated users
That's low enough that it may cause problems for a lot of infrastructure. Like, I'm pretty sure that the MELPA Emacs package repository builds out of git, and a lot of that is on GitHub.
That’s low enough that it may cause problems for a lot of infrastructure.
Likely the point. If you need more, get an API key.
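A minimal sketch of what "get an API key" looks like for a script, assuming a personal access token stored in an environment variable. The variable name `GITHUB_TOKEN` and the helper `github_request` are assumptions for illustration; the request is only built here, not sent.

```python
import os
import urllib.request


def github_request(url, token=None):
    """Build a GitHub REST API request, adding a bearer token when one is given."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, headers=headers)


# Assumption: the token lives in GITHUB_TOKEN; without it the request is unauthenticated.
token = os.environ.get("GITHUB_TOKEN")
req = github_request("https://api.github.com/rate_limit", token)
print("authenticated" if req.get_header("Authorization") else "unauthenticated")
# urllib.request.urlopen(req) would send it; left out so the sketch runs offline.
```

This doesn't help tools that can't be handed a token, which is the case several replies in this thread are worried about.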
No no, no no no no, no no no no, no no there's no limit
Until there is.
I think people are grossly underestimating the sheer size and significance of the issue at hand. Forgejo will very likely eventually get to the same point Github is at right now, and will have to employ some of the same safeguards.
Except Forgejo is open source and you can run your own instance of it. I do, and it's great.
Is this going to fuck over Obtainium?
RIP yocto builds
LOL!!!! RIP GitHub
EDIT: Trying to compile any project from source that uses git submodules will be interesting, e.g. ROCm has more than 60 submodules to pull in 💀
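To get a feel for how many separate fetches a recursive clone triggers, you can count the entries in a repo's top-level `.gitmodules`. A rough sketch: `submodule_names` is a made-up helper, the sample file is invented, and nested submodules (which have their own `.gitmodules`) are not counted. Whether each fetch counts against the 60/hour limit is this thread's assumption, not something the code checks.

```python
import re

# Matches section headers like: [submodule "libs/a"]
SUBMODULE_HEADER = re.compile(r'^\s*\[submodule\s+"([^"]+)"\]\s*$', re.MULTILINE)


def submodule_names(gitmodules_text):
    """Return the submodule names declared in the text of a .gitmodules file."""
    return SUBMODULE_HEADER.findall(gitmodules_text)


# Invented example content, not taken from any real project.
sample = '''[submodule "libs/a"]
\tpath = libs/a
\turl = https://github.com/example/a.git
[submodule "libs/b"]
\tpath = libs/b
\turl = https://github.com/example/b.git
'''

names = submodule_names(sample)
print(len(names), "top-level submodules:", names)
```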
The Go module system pulls dependencies from their sources. This should be interesting.
Even if you host your project on a different provider, many libraries are on GitHub. Think of all those unauthenticated Arch users trying to install Go-based software that pulls dependencies from GitHub.
How does the Rust module system work? How does pip?
Crazy how many people think this is okay, yet left Reddit because of its API shenanigans. GitHub is already halfway to requiring sign-in to view anything, like Twitter (X).
Open source repositories should rely on p2p. Torrenting repos is the way I think.
Not only for this. At any point m$ could take down your repo if they or their investors don't like it.
I wonder if something like that already exists, and if it could work with git?
Git is p2p and distributed from day 1. Github is just a convenient website. If Microsoft takes down your repo, just upload to another system. Nothing but convenience will be lost.
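"Just upload to another system" comes down to two git commands run from your existing local clone. A sketch that only builds the command lines; the remote name `backup` and the URL are placeholders, and `mirror_commands` is a made-up helper.

```python
def mirror_commands(new_url, remote_name="backup"):
    """Return the git invocations that copy every ref of a local clone to a new host."""
    return [
        ["git", "remote", "add", remote_name, new_url],
        # --mirror pushes all refs (branches, tags, ...) and deletes remote refs
        # that don't exist locally, so point it at an empty or dedicated repo.
        ["git", "push", "--mirror", remote_name],
    ]


# Placeholder URL, for illustration only.
for cmd in mirror_commands("https://codeberg.org/example/project.git"):
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True) would execute it inside the repository.
```

Issues, pull requests, and CI config live on the hosting side rather than in git, so that part of the "convenience" does not move with the push.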
The project's official repo should probably exist in a single location so that there is an authoritative version. At that point p2p is only necessary if traffic for the source code is getting too expensive for the project.
Personally I think the SourceHut model is closest to the ideal setup for OSS projects. Though I use Codeberg for my personal stuff because I'm cheap and lazy.
Wow so surprising, never saw this coming, this is my surprised face. :-l
It's always blocked me from searching in Firefox when I'm logged out, for some reason.