Selfhosted
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.
Rules:
- Be civil: we're here to support and learn from one another. Insults won't be tolerated. Flame wars are frowned upon.
- No spam posting.
- Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it's not obvious why your post topic revolves around selfhosting, please include details to make it clear.
- Don't duplicate the full text of your blog or github here. Just post the link for folks to click.
- Submission headline should match the article title (don't cherry-pick information from the title to fit your agenda).
- No trolling.
- No low-effort posts. This is subjective and will largely be determined by the community member reports.
Resources:
- selfh.st Newsletter and index of selfhosted software and apps
- awesome-selfhosted software
- awesome-sysadmin resources
- Self-Hosted Podcast from Jupiter Broadcasting
Any issues on the community? Report it using the report flag.
Questions? DM the mods!
In my experience, even a site with little legitimate traffic will eventually buckle under the torrent of bots and scrapers once it's been up long enough to get indexed by search engines, so the longer my stuff is out there, the more I expect to need DDoS protection.
I've got bot detection set up in Nginx on my VPS. It used to return 444 (Nginx's "close the connection and waste no more resources processing it" code), but I recently started piping that traffic to Nepenthes to return gibberish data for the bots to train on.
I documented a rough guide in a comment here. Of relevance to you are the two .conf files at the bottom. In deny-disallowed.conf, change the "return 301 ..." line to "return 444".
I also use the firewall and fail2ban on the VPS to block bad actors, overly aggressive scrapers, password brute-forcing, and so on, and the link between the VPS and my homelab equipment never sees that traffic.
In the case of a DDoS, I've done the following:
Granted, I'm not running anything mission-critical, just some services for friends and family, so I can deal with a little downtime.
I have something similar with fail2ban plus hidden buttons. If a requester goes and clicks one of the hidden buttons on the main site, it falls into a rabbit hole, and after 3 requests it gets banned for a bit. That usually stops the worst offenders. OpenAI and some of the other scrapers are the worst.
Google and Bing I do actually see hit robots.txt and then back off, which is what they should be doing.
Oooooh. That's smart. I mostly host apps, but in theory, I should be able to dynamically modify the response body and tack on some HTML for a hidden button and do that.
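One hedged way to do that, assuming Nginx is already fronting the apps as a reverse proxy, is the sub_filter module (requires ngx_http_sub_module): inject a hidden link into proxied HTML and log anything that follows it, so a fail2ban filter can ban the client. The /trap path, log file, and backend address below are made-up placeholders, not anyone's actual config.

```nginx
# Hypothetical sketch: inject a hidden trap link into proxied pages and log
# hits on it separately for fail2ban (or similar) to act on.

server {
    listen 80;
    server_name apps.example.com;

    location / {
        proxy_pass http://127.0.0.1:8080;

        # sub_filter only works on uncompressed responses, so ask the
        # backend not to gzip.
        proxy_set_header Accept-Encoding "";

        # Tack a link onto every HTML page that humans will never see or click.
        sub_filter '</body>' '<a href="/trap" style="display:none" rel="nofollow">do not click</a></body>';
        sub_filter_once on;
    }

    location = /trap {
        # Log trap hits to their own file; a fail2ban jail watching this
        # file can ban IPs after a few hits.
        access_log /var/log/nginx/trap.log;
        return 403;
    }
}
```

The fail2ban side (a filter matching trap.log and a jail with a small maxretry) isn't shown here and would need to be written to taste.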
I used to disallow everything in robots.txt but the worst crawlers just ignored it. Now my robots.txt says all are welcome and every bot gets shunted to the tarpit 😈
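If you want to mirror that in config, a tiny sketch (an assumption, not the poster's actual setup) of serving a welcome-all robots.txt straight from Nginx looks like this; the tarpit shunt itself is the same map/rewrite pattern sketched earlier in the thread.

```nginx
# Hypothetical: serve a permissive robots.txt directly from Nginx.
# An empty Disallow means everything is allowed -- honest bots obey it,
# and the dishonest ones end up in the tarpit anyway.
location = /robots.txt {
    default_type text/plain;
    return 200 "User-agent: *\nDisallow:\n";
}
```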
Nice! That's another way to do it. 😀
I know others use Arabis(?), I think that's what it's called. The anime girl one that does a calculation on top. I've never had good luck with it. I think bots are using something to get around it, and it messes with my requests. Might also be my own fiddling.
You probably mean Anubis.
Whoops, yes!
I've run a publicly accessible low-legitimate-traffic website that has been indexed by Google and others from my home network for >20 years without anything buckling so far. I don't even have a great connection (30 Mbps upstream).
Maybe I'm just lucky?
Consider what a DDoS attack looks like to Cloudflare, then consider what your home server can actually handle.
There's likely a very large gap between those two points.
My server will start to suffer long before traffic reaches the level of a modern DDoS attack.