this post was submitted on 25 Apr 2026
0 points (NaN% liked)
Selfhosted
60253 readers
452 users here now
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don't control.
Rules:
-
Be civil.
-
No spam.
-
Posts are to be related to self-hosting.
-
Don't duplicate the full text of your blog or readme if you're providing a link.
-
Submission headline should match the article title.
-
No trolling.
-
Promotion posts require active participation, with an account that is at least 30 days old. F/LOSS without a paywall has exceptions, with requirements. See the rules link for details.
Resources:
- selfh.st Newsletter and index of selfhosted software and apps
- awesome-selfhosted software
- awesome-sysadmin resources
- Self-Hosted Podcast from Jupiter Broadcasting
Any issues on the community? Report it using the report flag.
Questions? DM the mods!
founded 3 years ago
MODERATORS
you are viewing a single comment's thread
view the rest of the comments
view the rest of the comments
Not all programs allow usage of multiple gpus as far as I know, some are not capable of splitting the llm in multiple vrams or something
Yes, it is. But I have llama-swap, openweb-ui. If you spend some time on the llama-swap configuration, then the you have a good chance to run the model on 2 cards is through llama.cpp. The winnings, however, will not be x2 of course and will fall non-linearly from the number of cards. And you need motherboard with good PCI-E lines (2 pci-e x16 or more). But it's still cheaper than one large card. Example:
There is a less stable but more productive one: --split-mode row
P.S. By the way, one RX9070XT on my instance translates posts and comments. You can test it if you want. =)