this post was submitted on 03 Jun 2026

HomeLab

A homelab is a server, or a multi-server setup, that resides in your home and where you host several applications and virtualized systems for testing and development.

It's a sandbox environment where you can experiment, break things, and fix them with no repercussions while it's down.

This is a community where you can share, discuss, or post news relating to homelabs

Check which models you can run and at what rate, in tokens per second, they would run... It has examples for many models and quantization levels. Huge resource!

top 5 comments
[–] Hexarei@beehaw.org 3 points 3 weeks ago

This doesn't seem to take into account CPU MoE, which can make a huge difference. Running a bigger MoE model is better than running a small model that fits in your GPU, if you have the CPU resources. I run Qwen3.6 (the 30b/e4b version) in MoE at around 40 tok/s on my 5070 + Ryzen 9 5950X, and it's way better than that tool's recommended 9b.
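
A rough way to see why this can work: token generation is usually limited by memory bandwidth, so an upper bound on speed is the bandwidth divided by the bytes of weights read per token. The sketch below is a back-of-envelope estimate only. The function name, the 50 GB/s bandwidth figure, and the 3B active-parameter count are illustrative assumptions, not measurements from the setup above.

```python
def max_tokens_per_second(active_params_billion: float,
                          bits_per_weight: float,
                          bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed, assuming every active weight
    is read from memory exactly once per generated token."""
    bytes_per_token = active_params_billion * 1e9 * bits_per_weight / 8
    return bandwidth_gb_s * 1e9 / bytes_per_token


# Hypothetical numbers: 4-bit weights, 50 GB/s of system RAM bandwidth.
dense = max_tokens_per_second(30, 4, 50)  # dense model: all 30B weights active
moe = max_tokens_per_second(3, 4, 50)     # MoE: assume ~3B weights active per token
print(f"dense 30B: {dense:.1f} tok/s, MoE with 3B active: {moe:.1f} tok/s")
# prints: dense 30B: 3.3 tok/s, MoE with 3B active: 33.3 tok/s
```

Real throughput will be lower than this bound, but the ratio shows why a large MoE model with few active parameters can stay usable even when most of its weights sit in system RAM.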

[–] nutbutter@discuss.tchncs.de 2 points 3 weeks ago (1 children)

This feels useless. At least for homelabbers, ollama's model page gives more useful info. And if a newbie uses this tool, they'll be misled.

Also, a lot of people run models on CPUs, and the tool doesn't list anything about them at all. For example, I cannot fit Gemma 4 on my GPU, but ollama offloads part of it to the CPU, and even with small GPUs you can get good performance.

And for nearly all small models, it recommends the RTX 5060, which is a very stupid choice.

[–] B0rax@feddit.org 1 points 3 weeks ago (1 children)

What do you mean by "small GPU"?

I have not tried that yet, do you have any guidance? Or does "small GPU" still mean a >500€ GPU?

[–] nutbutter@discuss.tchncs.de 1 points 3 weeks ago

By small, I mean outdated GPUs, laptop GPUs, or GPUs with only 4GB or 6GB of VRAM.

[–] ComradePenguin@lemmy.ml 1 points 3 weeks ago

Interesting. I only have 8GB of VRAM, unfortunately, so I can't run anything particularly useful for my purpose 😔 The Gemma 4 E4B is quite good, but I'd like to run the 31B one.
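
For a sense of scale, weight memory is roughly the parameter count times bits per weight, divided by 8. A minimal sketch, assuming nominal bit widths and ignoring the KV cache and runtime overhead, both of which add to the total. The function name is made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


# Compare a 31B-parameter model at common bit widths against 8 GB of VRAM.
for bits in (16, 8, 4):
    size = weight_memory_gb(31, bits)
    verdict = "fits" if size <= 8 else "does not fit"
    print(f"31B at {bits}-bit: {size:.1f} GB, {verdict} in 8 GB VRAM")
```

Even at 4-bit the weights alone come to about 15.5 GB, so running a 31B model on an 8 GB card would mean keeping part of it in system RAM.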