Company accidentally spent $500 million on Claude AI in one month after forgetting usage limits (techstartups.com)

submitted 1 day ago by codeinabox@programming.dev to c/Technology@programming.dev

dan@upvote.au 3 points 1 day ago (last edited 1 day ago)

Not really. The state-of-the-art models are huge, even the open-weight ones. You really don't want to quantize below 4-bit, and even that's a bit of a stretch. Ideally you'd use at least 8-bit to get good results when using these models for coding.

GLM-5.1 needs around 400GB VRAM at 4-bit quantization. Apple aren't making the Mac Studio with 512GB unified RAM any more, so you'd need something like 5 x Nvidia A100 80GB to run a model like this.

Kimi K2.6 is around the same size.
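
To make the GPU count above concrete, here is a rough sketch of the arithmetic. The 400GB and 80GB figures come from the comment; the function name `gpus_needed`, the `overhead_fraction` parameter, and the 10% headroom value are assumptions for illustration only, since real deployments also need room for the KV cache and activations.

```python
import math


def gpus_needed(model_gb: float, gpu_gb: float = 80.0,
                overhead_fraction: float = 0.0) -> int:
    """Smallest number of GPUs whose combined memory covers the model.

    overhead_fraction is a hypothetical allowance for KV cache and
    activations, expressed as a fraction of the model size.
    """
    total_gb = model_gb * (1.0 + overhead_fraction)
    return math.ceil(total_gb / gpu_gb)


# Figures from the comment: ~400GB at 4-bit, 80GB per A100.
print(gpus_needed(400))                          # 5
# Assumed 10% headroom (illustrative, not a measured value).
print(gpus_needed(400, overhead_fraction=0.1))   # 6
# Assuming size scales linearly with bit width, 8-bit is roughly 800GB.
print(gpus_needed(800))                          # 10
```

With zero headroom, five cards is an exact fit, which is why "something like 5" is probably a floor rather than a comfortable setup.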

mindbleach@sh.itjust.works 1 point 1 day ago

Distillation works better than quantization, to the point that Qwen recently out-benchmarked its 397B model with a 27B model, two months apart. Arguably the only reason to train comically large models is that doing so is a decent strategy for finding very small ones.
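
For a sense of scale, a back-of-the-envelope sketch of the weight memory for the two parameter counts mentioned, using parameters × bits ÷ 8. This counts weights only and ignores KV cache, activations, and any quantization metadata; `weight_gb` is a hypothetical helper, not part of any library.

```python
def weight_gb(params_billions: float, bits: int) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes), weights only."""
    return params_billions * bits / 8


for params in (397, 27):
    for bits in (4, 8):
        print(f"{params}B at {bits}-bit: {weight_gb(params, bits)} GB")
# 397B at 4-bit: 198.5 GB
# 397B at 8-bit: 397.0 GB
# 27B at 4-bit: 13.5 GB
# 27B at 8-bit: 27.0 GB
```

Under these assumptions the 27B weights at 8-bit are smaller than a single 80GB card, while the 397B model needs multiple cards even at 4-bit.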