
2026 AI Capex Surges to $715 Billion as HBM Chip Supply Sells Out

Bottom Line First

The four tech giants (Amazon, Google, Meta, Microsoft) are projected to spend $715 billion on AI capex in 2026, with nearly all incremental spend AI-driven.

Meanwhile, the bottleneck for AI compute is shifting from GPUs to HBM (High Bandwidth Memory)—Micron’s CEO admitted on the latest earnings call that 2026 HBM supply is completely sold out, meeting only 50-65% of customer demand.

Data Breakdown

2026 AI Capex by Company

| Company | 2026 Capex Ceiling | YoY Growth | Primary Use |
| --- | --- | --- | --- |
| Amazon (AWS) | ~$200B | Accelerating (Bedrock spend reaccelerated to its fastest pace in 15 quarters) | GPU clusters + data centers + power |
| Google | ~$190B | Sustained growth | TPU + GPU + data center infrastructure |
| Microsoft | ~$190B | Maintaining high levels | Azure AI + OpenAI infrastructure |
| Meta | ~$135B | Significant increase | Llama training + AI ads + metaverse |
| **Total** | **~$715B** | | |

Supply Chain Bottleneck Shift

| Phase | Bottleneck | Current Status |
| --- | --- | --- |
| 2023-2024 | GPU capacity (NVIDIA A100/H100) | Massive capacity expansion; easing |
| 2025 | Advanced packaging (CoWoS) | TSMC expanding |
| 2026 | HBM memory | Industry-wide sold out; supply crunch |

HBM Market Landscape

| Supplier | Market Share | 2026 Capacity Status | Notes |
| --- | --- | --- | --- |
| SK Hynix | ~50% | Q1 revenue tripled YoY, surpassing 50T KRW for the first time | Announced $13B expansion plan |
| Micron | ~25% | Can meet only 50-65% of demand | Multi-year volume and pricing agreements locked in |
| Samsung | ~20% | Catching up | HBM3E production ramping |
| Others | ~5% | | |

Why AI Is Becoming “Memory-First”

Micron’s CEO delivered a key signal on the earnings call:

“AI is becoming a memory-first industry—because models and agents need longer ‘thinking’ time and more context retention.”

Technical Logic

Token throughput is gated jointly by HBM capacity (how much model weight plus KV cache fits on the accelerator) and HBM bandwidth (how fast those bytes can be streamed per generated token).

Longer agent thinking → Larger context windows → KV Cache bloat → Exponential HBM demand growth

When models scale from 7B to 70B parameters, and context windows from 8K to 128K, HBM demand grows far beyond linear.
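As a rough illustration of that super-linear growth, KV cache size can be estimated from model shape and context length. The architecture figures below (layers, heads, head dimension) are typical published numbers for 7B- and 70B-class dense transformers, used here as illustrative assumptions; full multi-head attention and FP16 storage are assumed.

```python
# Back-of-envelope KV cache sizing. Architecture numbers are
# illustrative assumptions for 7B/70B-class models, not vendor specs.

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context: int, bytes_per_elem: int = 2) -> float:
    """KV cache bytes for ONE sequence:
    2 (K and V) * layers * kv_heads * head_dim * context * bytes/elem."""
    total = 2 * layers * kv_heads * head_dim * context * bytes_per_elem
    return total / 1024**3

# 7B-class model (assumed: 32 layers, 32 KV heads, head_dim 128) at 8K context
small = kv_cache_gb(layers=32, kv_heads=32, head_dim=128, context=8192)

# 70B-class model (assumed: 80 layers, 64 KV heads, head_dim 128) at 128K context
large = kv_cache_gb(layers=80, kv_heads=64, head_dim=128, context=131072)

print(f"7B  @   8K: {small:.0f} GiB per sequence")   # 4 GiB
print(f"70B @ 128K: {large:.0f} GiB per sequence")   # 320 GiB
print(f"growth: {large / small:.0f}x")               # 80x
```

A 10x parameter scale-up combined with a 16x context scale-up yields roughly an 80x KV cache bill per sequence. Grouped-query attention (fewer KV heads) shrinks the absolute numbers, but the context-length term still dominates, which is exactly the "longer agent thinking → KV cache bloat" chain above.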

SanDisk’s AI Reversal

SanDisk’s earnings also validate this trend:

  • Last year: a loss of $0.30 per share; this quarter: earnings of $23.41 per share (vs. a $14.50 estimate)
  • Revenue: $5.95B (vs. a $4.70B estimate)
  • Five AI companies have signed long-term supply agreements

The storage industry has reversed from losses to windfall profits driven by AI demand.

Landscape Judgment

Short-term Impact (2026)

  • HBM supply tightness will persist throughout the year, driving up GPU inference costs
  • Model optimization will increasingly focus on memory efficiency: quantization, MoE, KV cache compression
  • Chinese domestic alternatives (e.g., CXMT) will see policy-driven acceleration
  • HBM4 standard release may ease some supply pressure
  • CXL memory pooling technology may change memory allocation paradigms
  • “Compute-in-memory” chip architectures may become a new competitive dimension

Investment Logic

| Track | Certainty | Upside | Representative Targets |
| --- | --- | --- | --- |
| HBM manufacturers | ★★★★★ | ★★★☆☆ | SK Hynix, Micron |
| GPU vendors | ★★★★☆ | ★★★★☆ | NVIDIA, AMD |
| Data center REITs | ★★★★☆ | ★★☆☆☆ | Data center real estate funds |
| Memory optimization software | ★★★☆☆ | ★★★★★ | Quantization/compression toolchains |

Action Recommendations

For AI Application Teams

  • Immediately evaluate your model’s memory usage efficiency; prioritize frameworks supporting quantized inference
  • Consider MoE architecture models for significantly reduced HBM demand at equivalent performance
  • Watch KV cache optimization techniques (PagedAttention, FlashDecoding)
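To see why PagedAttention-style paging matters, here is a minimal sketch of the core idea: allocating KV cache in small fixed-size blocks instead of one contiguous max-context buffer. This is a simplified illustration, not vLLM's actual allocator; the block and pool sizes are arbitrary.

```python
# Minimal sketch of paged KV cache allocation (the idea behind
# PagedAttention). Simplified for illustration; not vLLM's real code.

class PagedKVCache:
    def __init__(self, num_blocks: int, block_tokens: int = 16):
        self.block_tokens = block_tokens
        self.free = list(range(num_blocks))   # free physical block ids
        self.tables = {}                      # seq_id -> list of block ids
        self.lengths = {}                     # seq_id -> tokens stored

    def append_token(self, seq_id: int) -> None:
        """Reserve space for one more token; grab a new block only when
        the current one is full, so memory tracks the ACTUAL length."""
        used = self.lengths.get(seq_id, 0)
        if used % self.block_tokens == 0:     # current block full or absent
            if not self.free:
                raise MemoryError("KV pool exhausted")
            self.tables.setdefault(seq_id, []).append(self.free.pop())
        self.lengths[seq_id] = used + 1

    def release(self, seq_id: int) -> None:
        """Return a finished sequence's blocks to the shared pool."""
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=8)
for _ in range(20):                    # 20 tokens -> ceil(20/16) = 2 blocks
    cache.append_token(seq_id=0)
print(len(cache.tables[0]), "blocks used")   # 2
cache.release(0)
print(len(cache.free), "blocks free")        # 8
```

The payoff: memory scales with tokens actually generated rather than the maximum context window, so more sequences fit in the same HBM, which is precisely the scarce resource this article describes.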

For Hardware Procurement

  • HBM supply tightness may last 12-18 months; consider locking in supply contracts early
  • Evaluate AMD MI series as an NVIDIA alternative (better price-performance in some scenarios)

For Developers

  • Learn model quantization techniques (INT4/INT8) to run larger models on limited hardware
  • Watch memory optimization updates in local inference frameworks like llama.cpp and MLX
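The core idea behind weight quantization can be sketched in a few lines. The example below shows symmetric per-tensor INT8 quantization in plain Python; real toolchains (including llama.cpp's quantized formats) use per-block or per-channel scales plus calibration, so treat this as the principle only.

```python
# Symmetric per-tensor INT8 quantization, pure Python for clarity.
# Production formats use finer-grained (per-block/per-channel) scales.

def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-127, 127] with a single scale."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [x * scale for x in q]

w = [0.8, -1.2, 0.05, 2.54, -0.4]
q, s = quantize_int8(w)
restored = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(w, restored))
print(q)                                   # integers in [-127, 127]
print(f"max round-trip error: {err:.4f}")  # bounded by ~scale/2
```

Each weight now costs 1 byte instead of 2 (FP16) or 4 (FP32), at the price of a rounding error bounded by about half the scale, which is why quantized inference lets a memory-constrained GPU hold a model 2-4x larger.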