this post was submitted on 26 Feb 2026

127 points (89.4% liked)

Technology

82066 readers

3475 users here now

This is a most excellent place for technology news and articles.

Our Rules

Follow the lemmy.world rules.
Only tech related news or articles.
Be excellent to each other!
Mod approved content bots can post up to 10 articles per day.
Threads asking for personal tech support may be deleted.
Politics threads may be removed.
No memes allowed as posts, OK to post as comments.
Only approved bots from the list below, this includes using AI responses and summaries. To ask if your bot can be added please contact a mod.
Check for duplicates before posting, duplicates may be removed
Accounts 7 days and younger will have their posts automatically removed.

Approved Bots

founded 2 years ago

MODERATORS

L3s@lemmy.world

enu@lemmy.world

technopagan@lemmy.world

L4s@lemmy.world

L3s@hackingne.ws

127

Nvidia delivers first Vera Rubin AI GPU samples to customers — 88-core Vera CPU paired with Rubin GPUs with 288 GB of HBM4 memory apiece (www.tomshardware.com)

submitted 4 days ago by RegularJoe@lemmy.world to c/technology@lemmy.world

52 comments fedilink hide all child comments

top 50 comments

sorted by: hot top controversial new old

[–] Mynameisallen@lemmy.zip 94 points 4 days ago (2 children)

This is what all the parts we wanted went to

[–] Earthman_Jim@lemmy.zip 27 points 4 days ago (3 children)

Yeah, I wonder how long it will take them to clue in that no one wants to trade gaming for an AI fucking girlfriend ffs...

[–] Mynameisallen@lemmy.zip 16 points 4 days ago

Until the money stops pouring in I suppose

[–] Ebby@lemmy.ssba.com 10 points 4 days ago* (last edited 4 days ago) (2 children)

Idk... I'm a tad excited to buy one for peanuts on eBay in a couple years for a local smart home upgrade. Heck, when the bubble pops, maybe they can sell power from all those generators back to the city and lower our utility bills too. /S

[–] foggenbooty@lemmy.world 3 points 3 days ago (1 children)

I know you put /S, but for other people that read this it will not be an option. These only work in specialised servers that you will not he able to run at home (unless you're a mad scientists type).

[–] zebidiah@lemmy.ca 1 points 1 day ago (1 children)

I mean.... Xenon chips were server only originally, but eventually there started being enthusiast motherboards being produced you could drop one into, I feel like it's possible 3rd party Chinese manufacturing will supply a retrofit solution

[–] foggenbooty@lemmy.world 1 points 18 hours ago

Totally different. Go take a look at one of these things. Many of them aren't even GPUs you can slot into anything, they're totally custom and integrated into the main board which expects special cooling and interconnects. Like I said, if youre the mad scientists type that has a dedicated space for loud enterprise server racks, yeah, I'm sure you could figure out something. But this isn't going to be like a bunch or RAM and GPUs that you or I can use.

[–] Earthman_Jim@lemmy.zip 3 points 4 days ago

They might be useful for rendering, and I'd love to see how smoothly Teardown Game could run with all those cores.

[–] setsubyou@lemmy.world 5 points 4 days ago (1 children)

I mean if they came with a cool android body we could talk about it. It should at least be able to do cleaning and cooking. Otherwise my wife won’t like it.

[–] vacuumflower 2 points 3 days ago

It should at least be able to do cleaning and cooking.

So that's what we need android girlfriends for.

[–] roofuskit@lemmy.world 24 points 4 days ago (1 children)

Don't worry, you can rent them for $30 a month and stream all your video games.

[–] Mynameisallen@lemmy.zip 8 points 3 days ago

Not even, just the ones they deign to allow

[–] gnawmon@ttrpg.network 47 points 4 days ago (1 children)

so that's why my 5070 laptop has 8 GBs of VRAM...

my old 1080 also had 8 GBs of VRAM

[–] kittenzrulz123@lemmy.dbzer0.com 4 points 3 days ago (1 children)

Your 5070 laptop has 8gb of vram? My desktop 3060 has 12gb of vram and its not even the TI version.

[–] gnawmon@ttrpg.network 4 points 3 days ago

yuup

[–] Cocodapuf@lemmy.world 28 points 3 days ago (1 children)

Jesus fucking Christ, 288GB. And this is why I can't have 16?

[–] Corkyskog@sh.itjust.works 8 points 3 days ago

And you have to buy a rack of them with 72 of them.

[–] zebidiah@lemmy.ca 10 points 3 days ago

THIS is why we can't have nice things....

[–] xxce2AAb@feddit.dk 35 points 4 days ago

Goodbye, sweet hardware. You deserved better and so did we.

[–] phoenixz@lemmy.ca 15 points 3 days ago (2 children)

And none of us will be allowed to have them

Only datacenters and only fortune 500 companies will be able to use anything Nvidia

[–] Corkyskog@sh.itjust.works 5 points 3 days ago (1 children)

I mean if you have the 3 million to spend on a rack of them, I am sure they would allow you to have them.

I do wonder what happens a few years down the road when everyone are replacing their gpus with latest and greatest variants what happens to the old racks? Do they get sold for pennies on the dollar because everyone else doing AI wants cutting edge?

[–] eleitl@lemmy.zip 1 points 3 days ago

The failure rate is high for ML GPUs. The hardware is effectively consumables.

[–] eleitl@lemmy.zip 2 points 3 days ago

You can't do much with them, unless you're into deep leaning. And the power bill would bankrupt you. I wish I had a Cerebras box, but even the smallest one is 20 kW, liquid cooled.

[–] LoremIpsumGenerator@lemmy.world 6 points 3 days ago (1 children)

So this is where our future ram buy went into? Fuck this planet then 🤣

[–] eleitl@lemmy.zip 3 points 3 days ago

HBMx is a different product than DDRx/GDDRx, though parts of the fabbing are probably shared.

[–] RegularJoe@lemmy.world 18 points 4 days ago (2 children)

Nvidia's Vera Rubin platform is the company's next-generation architecture for AI data centers that includes an 88-core Vera CPU, Rubin GPU with 288 GB HBM4 memory, Rubin CPX GPU with 128 GB of GDDR7, NVLink 6.0 switch ASIC for scale-up rack-scale connectivity, BlueField-4 DPU with integrated SSD to store key-value cache, Spectrum-6 Photonics Ethernet, and Quantum-CX9 1.6 Tb/s Photonics InfiniBand NICs, as well as Spectrum-X Photonics Ethernet and Quantum-CX9 Photonics InfiniBand switching silicon for scale-out connectivity.

[–] TropicalDingdong@lemmy.world 21 points 4 days ago (5 children)

288 GB HBM4 memory

jfc..

Looking at the spec's... fucking hell these things probably cost over 100k.

I wonder if we'll see a generational performance leap with LLM's scaling to this much memory.

[–] AliasAKA@lemmy.world 18 points 4 days ago* (last edited 4 days ago) (1 children)

Current models are speculated at 700 billion parameters plus. At 32 bit precision (half float), that’s 2.8TB of RAM per model, or about 10 of these units. There are ways to lower it, but if you’re trying to run full precision (say for training) you’d use over 2x this, something like maybe 4x depending on how you store gradients and updates, and then running full precision I’d reckon at 32bit probably. Possible I suppose they train at 32bit but I’d be kind of surprised.

Edit: Also, they don’t release it anymore but some folks think newer models are like 1.5 trillion parameters. So figure around 2-3x that number above for newer models. The only real strategy for these guys is bigger. I think it’s dumb, and the returns are diminishing rapidly, but you got to sell the investors. If reciting nearly whole works verbatim is easy now, it’s going to be exact if they keep going. They’ll approach parameter spaces that can just straight up save things into their parameter spaces.

[–] in_my_honest_opinion@piefed.social 7 points 4 days ago (1 children)

Sure, but giant context models are still more prone to hallucination and reinforcing confidence loops where they keep spitting out the same wrong result a different way.

[–] AliasAKA@lemmy.world 4 points 3 days ago (3 children)

Sorry, I’m not saying that’s a good thing. It’s not just the context that’s expanding, but the parameter of the base model. I’m saying at some point you just have saved a compressed version of the majority of the content (we’re already kind of there) and you’d be able to decompress it even more losslessly. This doesn't make it more useful for anything other than recreating copyrighted works.

load more comments (3 replies)

[–] in_my_honest_opinion@piefed.social 6 points 4 days ago (2 children)

Fundamentally no, linear progress requires exponential resources. The below article is about AGI but transformer based models will not benefit from just more grunt. We're at the software stage of the problem now. But that doesn't sign fat checks, so the big companies are incentivized to print money by developing more hardware.

https://timdettmers.com/2025/12/10/why-agi-will-not-happen/

Also the industry is running out of training data

https://arxiv.org/html/2602.21462v1

What we need are more efficient models, and better harnessing. Or a different approach, reinforced learning applied to RNNs that use transformers has been showing promise.

load more comments (2 replies)

[–] boonhet@sopuli.xyz 4 points 4 days ago* (last edited 4 days ago)

LLMs can already use way more I believe, they don't really run them on a single one of these things.

The HBM4 would likely be great for speed though.

[–] Cocodapuf@lemmy.world 2 points 3 days ago

Lol, this was literally my exact response

https://lemmy.world/comment/22356808

I feel you man.

[–] panda_abyss@lemmy.ca 3 points 4 days ago

Yeah they’re going to cost as much as a house.

I think we’ll see much larger active portions of larger MOEs, and larger context windows, which would be useful.

The non LLM models I run would benefit a lot from this, but I don’t know of I’ll ever be able to justify the cost of how much they’ll be.

[–] yogurtwrong@lemmy.world 5 points 3 days ago (1 children)

The buzzwords make my head hurt. Sounds like a copypasta

[–] in_my_honest_opinion@piefed.social 4 points 3 days ago

Almost like an LLM wrote it...

[–] redsand@infosec.pub 7 points 3 days ago

Brick them all 🧱

[–] RizzRustbolt@lemmy.world 7 points 3 days ago (1 children)

But can it run Crysis?

[–] elucubra@sopuli.xyz 4 points 3 days ago

Can it run Doom?

[–] Hadriscus@jlai.lu 4 points 4 days ago (1 children)

Can't wait for it to hit secondhand market in november

[–] Cocodapuf@lemmy.world 5 points 3 days ago* (last edited 3 days ago) (2 children)

So we can do what? De solder the individual ram chips and populate them on custom dimms?

Pass.

It's too late for all of these rack mounted, AI oriented products, those resources are already spent, they're gone for us.

It's like they took the world's supply of high tech computer components and locked them in a room with a sign over the door that says "douchebags only". And even if you got into that room, those components are only compatible with douchebag OS, so even if you completely cleared that room of douchebags, you still have to throw all their useless party favors in the dumpster.

[–] in_my_honest_opinion@piefed.social 3 points 3 days ago

You scoff but this is already being done in China. They desolder good chips from bad cards and add them to a mule card.

https://overclock3d.net/news/gpu-displays/chinese-developers-create-modified-48gb-nvidia-rtx-4090d-and-32gb-rtx-4080-super-gpus-for-the-ai-cloud/

[–] vaultdweller013@sh.itjust.works 2 points 3 days ago

Bringus is gonna make a weird gaming computer by shoving one into a movie rental kiosk.

[–] Earthman_Jim@lemmy.zip 4 points 4 days ago (1 children)

who. fucking. cares.

[–] FaceDeer@fedia.io 31 points 4 days ago (1 children)

You're in a community called "Technology" and it's got a bunch of upvotes, so us, presumably.

[–] Earthman_Jim@lemmy.zip 3 points 4 days ago

The news is important, but when it comes to user-end AI in general, big fucking meh.

[–] fubarx@lemmy.world 2 points 4 days ago

Question is, how long before it makes it to the next DGX Spark? Some people don't have $10B to burn.

load more comments