Selfhosted
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.
The best/easiest way to get started with a self-hosted LLM is to check out this repo:
https://github.com/oobabooga/text-generation-webui
Its goal is to be the Automatic1111 of text generators, and it does a fair job at it.
A good model that's said to rival GPT-3.5 is the new Falcon model. The full-sized version is too big to run on a single GPU, but the 7B version "only" needs about 16GB of memory.
https://huggingface.co/tiiuae/falcon-7b
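For a rough sanity check on that 16GB figure, here is a back-of-the-envelope sketch. It assumes the model is loaded with 16-bit weights and treats "7B" as a nominal 7 billion parameters; the function name is made up for illustration, and the 8-bit and 4-bit rows are hypothetical quantized setups that aren't discussed above. It only counts the weights, so real usage will be higher once activations and runtime overhead are added.

```python
def estimate_weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: "7B" is taken as a nominal 7e9 parameters, not an exact count.
NOMINAL_7B_PARAMS = 7e9

# Bytes per parameter for a few common weight precisions (assumed setups).
PRECISIONS = [("16-bit", 2.0), ("8-bit", 1.0), ("4-bit", 0.5)]

for label, bytes_per_param in PRECISIONS:
    gb = estimate_weight_memory_gb(NOMINAL_7B_PARAMS, bytes_per_param)
    print(f"{label}: ~{gb:.1f} GB for weights alone")
```

At 16 bits per weight that comes to roughly 14 GB before any overhead, which lines up with "about 16GB" once you leave some headroom.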
The Wizard-Vicuna-13B-Uncensored model is also popular.
https://huggingface.co/ehartford/Wizard-Vicuna-13B-Uncensored
There are a ton of models out there, with new ones popping up every day. You just need to search around. The oobabooga repo also has a few models linked in its README.
Edit: there's also h2oGPT, which seems really promising. I'm going to try it out in the next couple of days.
https://github.com/h2oai/h2ogpt
Note that when using LLaMA-derived models, such as Vicuna, you are bound by their license to only use them for "research" purposes.
If you want an unrestricted version, go for OpenLLaMA or RedPajama.
Falcon is less restrictive and only wants a cut of profits if they exceed 1 million dollars, but I'd wager that fully unrestricted is the way to go.
Falcon has switched to Apache 2.0 and removed the commercial limit.
Sorry, I must've missed that somehow. In that case my comment only applies to LLaMA and its direct derivatives.