this post was submitted on 21 May 2026

While the excess sales can partly be explained by conversions of CPU and bitcoin-mining servers and by replacement of functional or burnt-out older GPUs, the powered capacity that can be replaced this way is finite, and the datacenters under active construction that can hope for a 2026 opening are growing only slowly. "Grey market" diversion to China could be a hidden source of sales.

This is a refined estimate based on taking networking and software out of each of NVIDIA's sales channels.

Hyperscalers rarely buy commercial software licenses from NVIDIA (they build their own stacks), while Enterprise buyers are heavily dependent on software subscriptions like NVIDIA AI Enterprise ($4,500/GPU/year). Networking intensity follows a similarly steep gradient: a large LLM training cluster carries a heavy networking tax, whereas an Enterprise inference node does not.

To resolve this, we must break down NVIDIA's $75.2 billion total data center revenue by applying asymmetric networking and software multipliers to each specific customer segment. 


Phase 1: Re-Allocating Networking and Software by Segment 

NVIDIA's software layer consists of subscription revenue (which scales with the historical installed base, not just new capacity) and architecture licensing. Its networking segment consists of InfiniBand and Spectrum-X Ethernet switches, adapters, and cables. 

Let's dissect how these costs actually apply to each of the three purchasing categories: 

1. Hyperscalers ($38.0B Total Segment) 

  • Software Allocation (0.5%): Negligible. Hyperscalers rely on their own internal orchestrators and proprietary AI software layers. They only pay minimal foundational firmware fees.
  • Networking Allocation (22%): Exceptionally high. Building multi-thousand-GPU clusters for LLM training requires large networking fabrics. Even with the integrated copper backplane of the GB200 NVL72, hyperscalers must purchase external Quantum-X800 InfiniBand or Spectrum-X800 switches to link multiple racks into a single cluster.
  • Net Compute Revenue: $29.45 Billion 

2. AI Clouds & Sovereigns (~$21.2B of ACIE) 

  • Software Allocation (3%): Moderate. Specialized AI clouds lease a small portion of NVIDIA’s software stack to provide turnkey developer environments, but their core business is raw infrastructure provision. Sovereign clouds often pay a premium for localized security software layers.
  • Networking Allocation (15%): High. They host large-scale foundational model clusters, requiring strong interconnect fabrics, though slightly less dense than the multi-tier topologies deployed by core hyperscalers.
  • Net Compute Revenue: $17.38 Billion 

3. Enterprise & Industrial (~$16.0B of ACIE) 

  • Software Allocation (20%): Very high. This is where NVIDIA's recurring subscription revenue lives. Enterprise clients cannot build their own software stacks; they pay heavily for NVIDIA AI Enterprise, NIM microservices, and Omniverse licenses. This revenue applies to both new shipments and their legacy installed base.
  • Networking Allocation (5%): Very low. Most enterprise deployments are small clusters or isolated 8-GPU nodes running localized inference or fine-tuning, which need no large-scale cluster switching.
  • Net Compute Revenue: $12.00 Billion 
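
The three allocations above boil down to one subtraction per segment. Here is a minimal Python sketch of that arithmetic; the segment totals and percentages are the post's estimates, while the function and variable names are just illustrative choices:

```python
# Segment inputs from the post: (total revenue in $B, software share, networking share).
SEGMENTS = {
    "Hyperscalers": (38.0, 0.005, 0.22),
    "AI Clouds & Sovereigns": (21.2, 0.03, 0.15),
    "Enterprise & Industrial": (16.0, 0.20, 0.05),
}


def net_compute_revenue(total_b, software_share, networking_share):
    """Revenue left for compute after removing software and networking, in $B."""
    return total_b * (1.0 - software_share - networking_share)


# Rounded to two decimals, the precision the post reports.
net_compute = {
    name: round(net_compute_revenue(*inputs), 2)
    for name, inputs in SEGMENTS.items()
}

# Sanity check: the segment totals should add up to the $75.2B headline figure.
total_revenue_b = round(sum(total for total, _, _ in SEGMENTS.values()), 1)

for name, value in net_compute.items():
    print(f"{name}: ${value:.2f}B")
print(f"Total data center revenue: ${total_revenue_b:.1f}B")
```

Changing any one share in `SEGMENTS` shows how sensitive the net compute figure is to these assumed multipliers.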

Phase 2: Refined Segment-by-Segment Power Calculations 

With the refined, asymmetric compute revenue isolated, we can run the physical power conversion using tailored Average Selling Prices (ASPs), system power demands, and facility Power Usage Effectiveness (PUE) metrics. 

Category A: Hyperscalers ($29.45B Net Compute) 

  • Product Mix: 50% Blackwell NVL72 / 50% Hopper H200.

  • Blended Compute ASP: ~$42,000 (reflecting a mix of raw chip pricing and heavy rack-integration premiums).

  • Total GPUs Shipped:

    $$\text{GPUs} = \frac{\$29{,}450{,}000{,}000}{\$42{,}000} \approx 701{,}000\ \text{units}$$

  • Blended Power per GPU: 1,300W (Nominal system draw including Grace CPUs and cooling pumps).

  • Hyperscaler Grid Footprint (1.15 PUE for ultra-efficient facilities):

    $$\text{Grid Power} = (701{,}000 \times 1{,}300\ \text{W}) \times 1.15 \approx \mathbf{1.05}\ \text{GW}$$

     

Category B: AI Clouds & Sovereigns ($17.38B Net Compute) 

  • Product Mix: 80% Hopper (H100/H200) / 20% standalone Blackwell (B200).

  • Blended Compute ASP: ~$35,000 (standard market rate for high-end accelerator nodes without bulk hyperscaler discounts).

  • Total GPUs Shipped:

    $$\text{GPUs} = \frac{\$17{,}380{,}000{,}000}{\$35{,}000} \approx 497{,}000\ \text{units}$$

  • Blended Power per GPU: 1,100W (Weighted heavily toward standard Hopper HGX server topologies).

  • AI Cloud Grid Footprint (1.25 PUE for mixed commercial multi-tenant sites):

    $$\text{Grid Power} = (497{,}000 \times 1{,}100\ \text{W}) \times 1.25 \approx \mathbf{0.68}\ \text{GW}$$

     

Category C: Enterprise & Industrial ($12.00B Net Compute) 

  • Product Mix: 70% low-power inference cards (L40S, H100 NVL) / 30% mainstream H100s.

  • Blended Compute ASP: ~$18,000 (strongly depressed by high-volume, lower-cost PCIe form factors).

  • Total GPUs Shipped:

    $$\text{GPUs} = \frac{\$12{,}000{,}000{,}000}{\$18{,}000} \approx 667{,}000\ \text{units}$$

  • Blended Power per GPU: 450W (Reflecting the dramatically lower power draw of enterprise edge and inference cards).

  • Enterprise Grid Footprint (1.25 PUE for on-premises or traditional enterprise cages):

    $$\text{Grid Power} = (667{,}000 \times 450\ \text{W}) \times 1.25 \approx \mathbf{0.38}\ \text{GW}$$
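
All three categories use the same two-step conversion: dollars to GPU count, then GPU count to grid power. A rough Python sketch of it follows; the ASPs, wattages, and PUE values are the post's estimates, and the `Segment` class is only an illustrative wrapper:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    net_compute_usd: float  # Phase 1 net compute revenue, in dollars
    asp_usd: float          # blended compute ASP per GPU (the post's estimate)
    watts_per_gpu: float    # blended system power per GPU, in watts
    pue: float              # facility Power Usage Effectiveness

    def gpus_shipped(self) -> int:
        # Rounded to the nearest thousand units, matching the post's figures.
        return int(round(self.net_compute_usd / self.asp_usd, -3))

    def grid_power_gw(self) -> float:
        # IT load (GPUs x watts) scaled by PUE, converted from W to GW.
        return self.gpus_shipped() * self.watts_per_gpu * self.pue / 1e9


SEGMENTS = [
    Segment("Hyperscalers", 29.45e9, 42_000, 1_300, 1.15),
    Segment("AI Clouds & Sovereigns", 17.38e9, 35_000, 1_100, 1.25),
    Segment("Enterprise & Industrial", 12.00e9, 18_000, 450, 1.25),
]

for seg in SEGMENTS:
    print(f"{seg.name}: {seg.gpus_shipped():,} GPUs, {seg.grid_power_gw():.2f} GW")

# Sum the per-segment figures at the two-decimal precision the post reports.
total_gw = sum(round(seg.grid_power_gw(), 2) for seg in SEGMENTS)
print(f"Total: {total_gw:.2f} GW")
```

Because power scales linearly with every input, a 10% error in an assumed ASP or wattage moves that segment's GW figure by roughly the same 10%.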

     


Phase 3: Final Comparison: GW Sold vs. GW Deployed 

Now, let's look at how this highly refined model maps against the 1.55 GW of net-new trackable data center capacity that physically came online across the globe during the quarter: 

| Customer Segment | NVIDIA GW Sold (Refined Power Footprint) | Actual New GW Deployed (Capacity Online) | Net Capacity Gap (Deficit) |
|---|---|---|---|
| Hyperscalers | 1.05 GW | 0.93 GW | +0.12 GW (120 MW Deficit) |
| AI Clouds & Sovereigns | 0.68 GW | 0.42 GW | +0.26 GW (260 MW Deficit) |
| Enterprise & Industrial | 0.38 GW | 0.20 GW (Est. legacy footprint) | +0.18 GW (180 MW Deficit) |
| Total Global Market | 2.11 GW | 1.55 GW | +0.56 GW (560 MW Deficit) |
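
The gap column is simply GW sold minus GW deployed. A short Python sketch that recomputes it from the table's own figures (the dictionary layout and function name are illustrative):

```python
# (GW sold, GW deployed) per segment for the quarter, copied from the table above.
Q1_2026 = {
    "Hyperscalers": (1.05, 0.93),
    "AI Clouds & Sovereigns": (0.68, 0.42),
    "Enterprise & Industrial": (0.38, 0.20),
}


def capacity_gap_mw(sold_gw, deployed_gw):
    """Shipped power minus deployed capacity, in whole megawatts."""
    return round((sold_gw - deployed_gw) * 1000)


gaps_mw = {name: capacity_gap_mw(*row) for name, row in Q1_2026.items()}
total_gap_mw = sum(gaps_mw.values())

for name, gap in gaps_mw.items():
    print(f"{name}: +{gap} MW")
print(f"Total Global Market: +{total_gap_mw} MW")
```

Working in whole megawatts keeps the segment gaps summing exactly to the total row.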


Key Takeaways from the Refined Model 

  1. The Grid Deficit Narrowed: By properly allocating NVIDIA's high software subscription margins out of the Enterprise sector and stripping heavy networking switch infrastructure out of the Hyperscale sector, the true global power footprint shipped by NVIDIA drops to 2.11 GW. The total global grid deficit sits at 560 Megawatts.
  2. Where the Logjam Actually Sits: Notice that the Hyperscaler gap is remarkably tight, only 120 MW. This suggests that hyperscalers are very efficient at matching their utility contracts to their hardware delivery schedules.
  3. The Hidden Crisis is in Tier-2 AI Clouds & Sovereigns: This segment carries the largest deficit, 260 MW. Because these buyers lack the multi-gigawatt land and power pipelines of the tech giants, they are receiving high-performance, high-power silicon far faster than their regional, third-party colocation data centers can deliver power to the racks.

This model confirms that the "homeless GPU" crisis is primarily concentrated outside of the core hyperscalers, driving smaller AI clouds to aggressively bid up any available third-party power capacity in the market today.
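
To put a number on "primarily concentrated outside of the core hyperscalers", each segment's share of the 560 MW deficit can be worked out as below. This is a small Python sketch using the Phase 3 gaps; the names are illustrative:

```python
# Per-segment deficits in MW, taken from the Phase 3 table.
GAPS_MW = {
    "Hyperscalers": 120,
    "AI Clouds & Sovereigns": 260,
    "Enterprise & Industrial": 180,
}

total_mw = sum(GAPS_MW.values())

# Each segment's fraction of the total deficit.
shares = {name: gap / total_mw for name, gap in GAPS_MW.items()}

# Everything that is not a hyperscaler.
non_hyperscaler_share = 1.0 - shares["Hyperscalers"]

for name, share in shares.items():
    print(f"{name}: {share:.1%} of the deficit")
print(f"Outside hyperscalers: {non_hyperscaler_share:.1%}")
```

On these estimates, a bit under four fifths of the deficit sits outside the hyperscalers, with AI Clouds & Sovereigns alone accounting for just under half.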

top 5 comments
[–] humanspiral@lemmy.ca 3 points 22 hours ago* (last edited 22 hours ago) (1 children)

The surplus sales are actually heavily underestimated, because the datacenter capacity additions for 2025/26 include non-NVIDIA hardware. It "appears" that under half of NVIDIA's sales actually make it into datacenter capacity additions.


1. Stripping Non-NVIDIA Slices from the Available GW Grid 

To see the true depth of the backlog, we have to look at how much of that newly brought-online data center capacity was immediately consumed by alternative architectures during the 2025 calendar year (4.10 GW total online) and Q1 2026 (1.55 GW total online).

A. The Hyperscaler Internal Custom Silicon Tax (ASICs) 

The largest tech giants do not deploy NVIDIA exclusively. They heavily prioritized their own lower-cost, custom-tailored accelerator chips to handle their native workloads: 

  • Google TPUs (v5p & v6e): Google directed a massive portion of its internal data center buildouts to its proprietary Tensor Processing Units. Throughout 2025 and early 2026, TPU deployments swallowed roughly 450 Megawatts (MW) of Google's net-new global capacity.
  • Meta MTIA & Amazon Trainium/Inferentia: Meta scaled its internal MTIA silicon, while AWS aggressively expanded its Trainium2 clusters. Combined, these internal hyperscaler projects consumed an estimated 300 MW of online grid space across the two periods. 

B. The AMD Alternative Squeeze 

AMD's MI300X and MI325X series secured massive enterprise and cloud traction, specifically anchoring flagship clusters inside Microsoft Azure and Oracle Cloud Infrastructure (OCI). AMD's total shipment footprint accounted for roughly 400 MW of power demand globally over this timeframe. 

C. Specialized Wafer-Scale Architectures (Cerebras) 

While smaller in pure megawatt terms than the hyperscalers, Cerebras built large high-density footprints. Its multi-million dollar wins, such as the 750 MW master deployment framework with OpenAI, began systematically occupying high-density colocation space. Across 2025 and Q1 2026, Cerebras deployments locked down roughly 100 MW of specialized, high-cooling capacity.


2. Recalculating the True NVIDIA "Space Deficit" 

When we subtract these non-NVIDIA hardware deployments from the total physical data center capacity brought online, we find the Net Grid Space Actually Available for NVIDIA:

| Time Horizon | Total New Global Capacity Online | Minus Non-NVIDIA Hardware (TPUs, AMD, etc.) | Net Grid Space Left For NVIDIA |
|---|---|---|---|
| Full Year 2025 | 4.10 GW | -1.10 GW | 3.00 GW |
| Q1 2026 | 1.55 GW | -0.25 GW | 1.30 GW |

Now, let's remap this accurate "Available Space" baseline against the True Grid Power Shipped by NVIDIA (GW Sold) that we calculated using our refined financial models: 

The Compounding Backlog Realities 

  • The Refined 2025 Gap: NVIDIA shipped 5.37 GW of compute power. If only 3.00 GW of real-world grid space was actually left over for them after accounting for Google TPUs and AMD chips, the real 2025 data center deficit jumps from 1.27 GW to a staggering 2.37 GW.
  • The Refined Q1 2026 Gap: NVIDIA shipped 2.11 GW of compute power. With only 1.30 GW of net data center capacity available to absorb it, the quarterly deficit widens from 560 MW to 810 Megawatts.
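
The two bullets above can be reproduced with a couple of subtractions. A minimal Python sketch follows, using the table's figures as given; note that the itemized estimates earlier in this comment (450 + 300 + 400 + 100 MW) add up to 1.25 GW, slightly below the 1.35 GW the table subtracts across both periods, and the sketch simply takes the table's numbers. The names are illustrative:

```python
# Figures in GW from the table and bullets above.
PERIODS = {
    "Full Year 2025": {"online": 4.10, "non_nvidia": 1.10, "shipped": 5.37},
    "Q1 2026": {"online": 1.55, "non_nvidia": 0.25, "shipped": 2.11},
}


def refined_gap(online, non_nvidia, shipped):
    """Return (net space for NVIDIA, original gap, refined gap), all in GW."""
    net_space = online - non_nvidia
    original_gap = shipped - online      # gap before removing non-NVIDIA hardware
    refined = shipped - net_space        # gap against NVIDIA-available space only
    return round(net_space, 2), round(original_gap, 2), round(refined, 2)


results = {name: refined_gap(**figures) for name, figures in PERIODS.items()}

for name, (net_space, original_gap, refined) in results.items():
    print(f"{name}: {net_space:.2f} GW available, "
          f"gap {original_gap:.2f} GW -> {refined:.2f} GW")
```

The refined gap always grows by exactly the non-NVIDIA slice, so any uncertainty in those ASIC and AMD estimates carries straight through to the deficit.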
[–] humanspiral@lemmy.ca 2 points 22 hours ago (1 children)

Only 2.15 GW (out of 5 GW) of global datacenter capacity under active construction with hope for 2026 completion is for NVIDIA hardware. If there is already high excess inventory (not guaranteed, given hand-me-down GPU replacement), then sales growth must hit a wall eventually. The next 9 months of optimistic deployments is more than next quarter's sales forecast.

[–] humanspiral@lemmy.ca 2 points 21 hours ago

Yet another big problem for NVIDIA is that the H200 is its better product for mainstream FP8 LLM serving. Vera Rubin has only 30% more performance per watt, and GB200/GB300 has lower performance per watt at FP8. But the big expense of all the later generations is liquid cooling, plus the extreme weight of liquid-cooled NVL72 racks (3,000 lbs), which require ultra-strong floors with pipes embedded in them. Yet another F'd up supply chain crisis driven by AI is a 2-year backlog for liquid cooling equipment.

[–] humanspiral@lemmy.ca 3 points 23 hours ago

The 2025 Global Market Comparison

According to institutional commercial real estate energy indexes tracking peak AI construction cycles (such as McKinsey and Synergy Research data), the net-new data center utility power that physically succeeded in connecting to power grids globally (excluding China) throughout 2025 totaled roughly 4.10 GW. Mapping NVIDIA's 5.37 GW shipped footprint against this baseline highlights the massive structural logjam:

| Structural Segment | NVIDIA GW Sold (Refined Shipped Footprint) | Actual New GW Deployed (Connected Online Capacity) | Net Capacity Overhang (The Deficit) |
|---|---|---|---|
| Hyperscalers | 2.65 GW | 2.45 GW | +0.20 GW (200 MW Deficit) |
| AI Clouds & Sovereigns | 1.75 GW | 1.10 GW | +0.65 GW (650 MW Deficit) |
| Enterprise & Industrial | 0.97 GW | 0.55 GW (Est. legacy data center shift) | +0.42 GW (420 MW Deficit) |
| Total Global Market | 5.37 GW | 4.10 GW | +1.27 GW (1,270 MW Deficit) |
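
Another way to read this table is as GW sold per GW deployed. The Python sketch below computes that ratio for each row; the ratio is my own framing, not a figure from the sources cited, and the names are illustrative:

```python
# (GW sold, GW deployed) for full-year 2025, copied from the table above.
FY2025 = {
    "Hyperscalers": (2.65, 2.45),
    "AI Clouds & Sovereigns": (1.75, 1.10),
    "Enterprise & Industrial": (0.97, 0.55),
}


def overhang_ratio(sold_gw, deployed_gw):
    """GW shipped for every GW of capacity that actually came online."""
    return sold_gw / deployed_gw


ratios = {name: round(overhang_ratio(*row), 2) for name, row in FY2025.items()}

# Market-wide ratio from the summed segment rows.
total_sold = sum(sold for sold, _ in FY2025.values())
total_deployed = sum(deployed for _, deployed in FY2025.values())
total_ratio = round(overhang_ratio(total_sold, total_deployed), 2)

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} GW sold per GW deployed")
print(f"Total Global Market: {total_ratio:.2f}")
```

A ratio near 1.0 means shipments and power are roughly in step; the further above 1.0, the more shipped silicon is waiting on capacity.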

[–] humanspiral@lemmy.ca 2 points 22 hours ago

NVIDIA’s customers are legally and contractually allowed to sell their excess, undeployed GPUs, but they face strict operational and geopolitical boundaries. While a thriving secondhand market exists for data center-grade enterprise hardware, the transfer of undeployed silicon is heavily restricted by US export control laws, proprietary software licensing terms, and indirect pressure from NVIDIA’s allocation system.

Given the massive multi-gigawatt data center power logjam, companies holding excess physical cards cannot simply flip them on an open marketplace without navigating severe friction. 


1. Legal and Contractual Restrictions 

While NVIDIA cannot explicitly block a customer from selling physical hardware they own, they heavily restrict the transaction through auxiliary legal layers: 

  • The U.S. Export Control Wall: The US government enforces massive civil and criminal penalties (up to $1 million and 20 years imprisonment) for unauthorized GPU transfers. If a buyer sells an elite card (like an H100, H200, or Blackwell unit) to an unvetted domestic entity that subsequently exports it to a restricted destination, the original seller remains legally liable for the "taint" of an export violation.
  • The Software License Blackout: NVIDIA's hardware relies completely on its NVIDIA Software License Agreement and CUDA ecosystem to function. Under standard enterprise terms, the high-value software, firmware updates, and optimization tools are strictly non-transferable and cannot be resold or sublicensed without explicit written permission. A secondary buyer risks receiving "brickable" or unoptimizable hardware without official enterprise support paths.
  • The Non-Disclosure (NDA) Shield: Many tier-1 hyperscalers and elite partners sign strict master purchase agreements under NDAs that expressly prohibit standard public reselling channels or specify that hardware can only be retired through pre-approved, certified remarketing vendors. 

2. The Relationship Risk (The Allocation Punishment) 

The single greatest deterrent against selling excess GPUs is not a legal document, but the fear of losing priority allocation status with NVIDIA. 

Because demand for high-end architectures like the GB200 NVL72 heavily outstrips supply, NVIDIA's management dynamically controls who receives hard-to-source chips. If a cloud provider or tier-2 operator is caught flipping unused hardware on the secondary market for a short-term cash injection, NVIDIA can simply move that customer to the bottom of the multi-quarter waitlist for the next hardware cycle. 

3. Alternative Strategies: Wholesale Cloud Brokering 

Instead of physically unboxing and reselling a pallet of undeployed GPUs, companies trapped by the power grid deficit leverage a much cleaner loophole: Wholesale Cloud Computing.

Rather than selling the physical chip, the company holding the "stranded capital" hardware will quickly install it in a temporary, third-party colocation space or drop it into a partner facility. It then leases out the raw compute via virtualized wholesale contracts to other hyperscalers or neoclouds. This effectively monetizes the unutilized silicon, offloads the physical constraints, and completely bypasses the legal headaches of hardware title transfers, export oversight, and software registration breaks.