ChaoBro

2026 AI Capex Surges to $715 Billion as HBM Chip Supply Sells Out

Bottom Line First

The four tech giants (Amazon, Google, Meta, Microsoft) are projected to spend up to roughly $715 billion in combined capex in 2026, with nearly all of the incremental spend driven by AI.

Meanwhile, the bottleneck for AI compute is shifting from GPUs to HBM (High Bandwidth Memory). On Micron's latest earnings call, the CEO said the company's 2026 HBM supply is completely sold out and covers only 50-65% of customer demand.

Data Breakdown

2026 AI Capex by Company

| Company | 2026 Capex Ceiling | YoY Growth | Primary Use |
|---|---|---|---|
| Amazon (AWS) | ~$200B | Accelerating (Bedrock spend reaccelerated to fastest in 15 quarters) | GPU clusters + data centers + power |
| Google | ~$190B | Sustained growth | TPU + GPU + data center infrastructure |
| Microsoft | ~$190B | Maintaining high levels | Azure AI + OpenAI infrastructure |
| Meta | ~$135B | Significant increase | Llama training + AI ads + metaverse |
| Total | ~$715B | | |

Supply Chain Bottleneck Shift

| Phase | Bottleneck | Current Status |
|---|---|---|
| 2023-2024 | GPU capacity (NVIDIA A100/H100) | Massive capacity expansion, easing |
| 2025 | Advanced packaging (CoWoS) | TSMC expanding |
| 2026 | HBM memory | Industry-wide sold out, supply crunch |

HBM Market Landscape

| Supplier | Market Share | 2026 Capacity Status | Notes |
|---|---|---|---|
| SK Hynix | ~50% | Q1 revenue tripled YoY, surpassed 50T KRW for first time | Announced $13B expansion plan |
| Micron | ~25% | Can only meet 50-65% of demand | Multi-year volume and pricing agreements locked |
| Samsung | ~20% | Catching up | HBM3E production ramping |
| Others | ~5% | | |

Why AI Is Becoming "Memory-First"

Micron's CEO delivered a key signal on the earnings call:

"AI is becoming a memory-first industry—because models and agents need longer 'thinking' time and more context retention."

Technical Logic

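Token throughput depends on both HBM capacity and HBM bandwidth: capacity limits how many sequences and how much context fit on the device, and bandwidth limits how fast weights and KV cache can be read for each generated token.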
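To make the capacity and bandwidth relationship concrete, here is a rough back-of-envelope sketch. It assumes decoding is purely memory-bandwidth bound and that each decode step reads all weights once plus every active sequence's KV cache. The function names and the hardware and model figures (80 GB of HBM, 2 TB/s, 14 GB of weights, 4 GB of KV cache per sequence) are illustrative assumptions, not vendor specifications or figures from the earnings call.

```python
def max_batch(hbm_bytes, weight_bytes, kv_bytes_per_seq):
    """Capacity limit: sequences that fit after the weights are loaded."""
    return max(0, (hbm_bytes - weight_bytes) // kv_bytes_per_seq)


def decode_tokens_per_second(weight_bytes, kv_bytes_per_seq, batch, bandwidth):
    """Bandwidth limit: upper bound on decode throughput.

    Assumes each step reads the weights once (shared by the batch)
    plus the KV cache of every sequence, and yields one token per sequence.
    """
    bytes_per_step = weight_bytes + batch * kv_bytes_per_seq
    return batch * bandwidth / bytes_per_step


# Illustrative, assumed numbers (not real product specs).
HBM_BYTES = 80 * 10**9        # HBM capacity
BANDWIDTH = 2 * 10**12        # HBM bandwidth in bytes/second
WEIGHT_BYTES = 14 * 10**9     # e.g. a 7B-parameter model at 2 bytes/param
KV_PER_SEQ = 4 * 10**9        # KV cache for one long-context sequence

batch = max_batch(HBM_BYTES, WEIGHT_BYTES, KV_PER_SEQ)
single = decode_tokens_per_second(WEIGHT_BYTES, KV_PER_SEQ, 1, BANDWIDTH)
batched = decode_tokens_per_second(WEIGHT_BYTES, KV_PER_SEQ, batch, BANDWIDTH)
print(f"max batch: {batch}")
print(f"tokens/s at batch 1: {single:.1f}")
print(f"tokens/s at max batch: {batched:.1f}")
```

Under these assumptions, more capacity raises the achievable batch size and more bandwidth raises the per-step rate, which is why both dimensions of HBM matter for throughput.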

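Longer agent thinking → Larger context windows → KV cache growth → Sharply higher HBM demand

When models scale from 7B to 70B parameters and context windows from 8K to 128K, the increases compound: weight memory grows roughly with parameter count, while KV cache grows linearly with context length and batch size and also with model depth and width. Total HBM demand therefore rises much faster than either factor alone.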
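The KV cache part of this growth can be estimated with the standard sizing formula: two tensors (keys and values) per layer, each of size KV heads × head dimension × sequence length × batch. The sketch below applies it to the 8K and 128K context lengths mentioned above. The model shape (32 layers, 32 KV heads, head dimension 128, 2 bytes per element) is a hypothetical configuration chosen for illustration, not a description of any specific model.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch=1, bytes_per_elem=2):
    """KV cache size in bytes for a decoder-only transformer.

    The leading factor 2 accounts for one key tensor and one value tensor per layer.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem


GIB = 1024 ** 3

# Hypothetical model shape, for illustration only.
shape = dict(layers=32, kv_heads=32, head_dim=128)

for context in (8 * 1024, 128 * 1024):
    size = kv_cache_bytes(seq_len=context, **shape)
    print(f"{context} tokens: {size / GIB:.1f} GiB per sequence")
# 8192 tokens: 4.0 GiB per sequence
# 131072 tokens: 64.0 GiB per sequence
```

For this assumed shape, moving from 8K to 128K context multiplies the per-sequence cache by 16, and the total scales again with the number of concurrent sequences.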

SanDisk's AI Reversal

SanDisk's earnings also validate this trend:

  • Earnings per share: $23.41 this quarter (vs. $14.50 estimate), up from a loss of $0.30 per share last year
  • Revenue: $5.95B (vs. $4.70B estimate)
  • Five AI companies signed long-term supply agreements

The storage industry has reversed from losses to windfall profits driven by AI demand.

Market Outlook

Short-term Impact (2026)

  • HBM supply tightness will persist throughout the year, driving up GPU inference costs
  • Model optimization will increasingly focus on memory efficiency: quantization, MoE, KV cache compression
  • Chinese domestic alternatives (e.g., CXMT) will likely receive accelerated policy support

Medium-term Trends (2027-2028)

  • The HBM4 volume production ramp may ease some supply pressure
  • CXL memory pooling technology may change memory allocation paradigms
  • "Compute-in-memory" chip architectures may become a new competitive dimension

Investment Logic

| Sector | Certainty | Upside | Representative Targets |
|---|---|---|---|
| HBM manufacturers | ★★★★★ | ★★★☆☆ | SK Hynix, Micron |
| GPU vendors | ★★★★☆ | ★★★★☆ | NVIDIA, AMD |
| Data center REITs | ★★★★☆ | ★★☆☆☆ | Data center real estate funds |
| Memory optimization software | ★★★☆☆ | ★★★★★ | Quantization/compression toolchains |

Action Recommendations

For AI Application Teams

  • Immediately evaluate your model's memory usage efficiency; prioritize frameworks supporting quantized inference
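  • Consider MoE architecture models, which activate only a subset of parameters per token and so reduce compute and memory bandwidth per token at comparable quality; all experts still need to be resident in memory, so capacity requirements shrink less than bandwidth requirements
  • Watch KV cache optimization techniques (PagedAttention, Flash-Decoding)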
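As a starting point for the memory evaluation suggested above, the following sketch estimates weight storage at different precisions and, for an MoE model, separates what must stay resident in memory from what is read per token. It ignores quantization scales, activations, and KV cache, so treat the results as lower bounds. The function names and the MoE sizes (100B total, 15B active) are hypothetical, chosen only for illustration.

```python
def weight_bytes(params, bits_per_weight):
    """Approximate weight storage, ignoring quantization scales and metadata."""
    return params * bits_per_weight // 8


def moe_footprint(total_params, active_params, bits_per_weight):
    """Return (resident_bytes, bytes_read_per_token) for an MoE model.

    All experts stay resident in memory; only the active ones are read per token.
    """
    return (
        weight_bytes(total_params, bits_per_weight),
        weight_bytes(active_params, bits_per_weight),
    )


DENSE_PARAMS = 70_000_000_000  # a dense model at the 70B scale

for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_bytes(DENSE_PARAMS, bits) / 1e9:.0f} GB")

# Hypothetical MoE: 100B total parameters, 15B active per token, 8-bit weights.
resident, per_token = moe_footprint(100_000_000_000, 15_000_000_000, 8)
print(f"MoE resident: {resident / 1e9:.0f} GB, read per token: {per_token / 1e9:.0f} GB")
```

In this simplified model, quantization cuts both capacity and bandwidth needs, while MoE mainly cuts the bytes read per token.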

For Hardware Procurement

  • HBM supply tightness may last 12-18 months; consider locking in supply contracts early
  • Evaluate AMD MI series as an NVIDIA alternative (better price-performance in some scenarios)

For Developers

  • Learn model quantization techniques (INT4/INT8) to run larger models on limited hardware
  • Watch memory optimization updates in local inference frameworks like llama.cpp and MLX
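For readers new to quantization, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is meant only to show the core idea (one floating-point scale plus small integers); production toolchains typically use per-channel or per-group scales and more careful calibration. The function names and sample weights are made up for this example.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to approximate floating-point values."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.02, 1.0]  # made-up sample weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print("int8 values:", q)
print("max abs error:", max(abs(a - b) for a, b in zip(weights, restored)))
```

Each weight is stored in one byte instead of two or four, at the cost of a rounding error of at most half the scale per value.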