The Signal
The latest Intelligence Index data reveals an underappreciated trend: the capability gap between Chinese open-source models and global closed-source flagships is closing rapidly.
| Model | Intelligence Index | Open Source | Price / Positioning |
|---|---|---|---|
| GPT-5.5 | 60 | Closed | $5 / $30 per M tokens (in/out) |
| Gemini 3 / Claude | 57 | Closed | $3.50 / $15 per M tokens (in/out) |
| Kimi K2.6 | 54 | Open | ~$1.70 / $3 per M tokens (in/out) |
| MiMo V2.5 Pro | 54 | Open | MIT License |
| DeepSeek V4 Pro | 52 | Open | $2.20 / $3.48 per M tokens (in/out) |
| GLM-5.1 | ~50 | Open | Subscription |
| MiniMax M2.7 | ~49 | Open | Low-cost |
The gap between GPT-5.5 and Kimi K2.6 is only 6 points. Yet Kimi K2.6’s API costs roughly one-tenth as much as GPT-5.5’s on output pricing, a cost-performance curve steep enough to change most enterprises’ model selection decisions.
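The one-tenth figure holds on output prices ($30 versus $3 per M tokens); the blended ratio depends on your traffic mix. A quick sanity check, assuming a hypothetical 3:1 input-to-output token split:

```python
# Blended per-M-token price under an assumed 3:1 input:output token mix.
# List prices come from the table above; the traffic mix is an assumption.
def blended_price(input_price: float, output_price: float, input_share: float = 0.75) -> float:
    return input_price * input_share + output_price * (1 - input_share)

gpt  = blended_price(5.00, 30.00)   # $11.25 per M tokens
kimi = blended_price(1.70, 3.00)    # ~$2.03 per M tokens
print(f"GPT-5.5 blended:   ${gpt:.2f}/M tokens")
print(f"Kimi K2.6 blended: ${kimi:.2f}/M tokens")
print(f"ratio: {gpt / kimi:.1f}x")  # ~5.6x blended; the 10x figure is output-only
```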
The Practical Meaning of a 6-Point Gap
The Intelligence Index was designed to comprehensively evaluate model capabilities in real-world scenarios — not memorized benchmark scores, but a weighted score across reasoning, coding, instruction following, long context, and more.
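The index’s exact weights aren’t public; the sketch below only illustrates the weighted-composite shape of such a score, with invented weights and sub-scores:

```python
# Illustrative only: the category weights and sub-scores below are invented,
# not the Intelligence Index's actual (unpublished) methodology.
weights    = {"reasoning": 0.30, "coding": 0.25, "instruction_following": 0.20,
              "long_context": 0.15, "other": 0.10}
sub_scores = {"reasoning": 58, "coding": 55, "instruction_following": 52,
              "long_context": 50, "other": 48}

index = sum(w * sub_scores[cat] for cat, w in weights.items())
print(f"composite index: {index:.1f}")  # 53.9 with these made-up numbers
```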
What does a 6-point gap mean?
In 80% of daily development scenarios, users cannot tell the difference.
A developer sharing their “budget AI package” on VEX put it plainly:
“I use DeepSeek V4 Flash for coding — the free tier is enough for daily use. When I need reasoning power, I switch to Pro, pay-per-use, and it costs just a few bucks a month.”
This isn’t theoretical “good enough”; it’s a choice made in real production environments. When Kimi K2.6 beat Claude Opus 4.7 on LiveBench (a continuously refreshed benchmark designed to resist training-data contamination), the narrative of a closed-source “capability moat” began to crumble.
The Catch-Up Path of Open-Source Models
Looking at the Intelligence Index trajectory:
- 2025 Q2: GPT-5.0 (50) vs DeepSeek V3 (38) → 12-point gap
- 2025 Q4: GPT-5.2 (55) vs DeepSeek V4 (45) → 10-point gap
- 2026 Q1: GPT-5.5 (60) vs Kimi K2.6 (54) → 6-point gap
The catch-up pace is accelerating: the gap narrowed by 2 points between the first two snapshots and by 4 points in the most recent one. If the trend holds, open-source models could reach today’s GPT-5.5 level by the end of 2026.
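Reading those three snapshots as a straight line (a strong assumption; capability curves rarely stay linear), the projected close-out lands in late 2026:

```python
# Straight-line extrapolation of the gap from the three snapshots above.
# A trend line under a linearity assumption, not a forecast.
observed = [(2025.25, 12), (2025.75, 10), (2026.00, 6)]  # (year + quarter, gap)

(t0, g0), (t1, g1) = observed[0], observed[-1]
rate = (g0 - g1) / (t1 - t0)        # 8 points narrowed per year over the window
print(f"narrowing rate: {rate:.1f} points/year")

t, gap = t1, g1
while gap > 0:
    t += 0.25
    gap = max(0.0, gap - rate * 0.25)
print(f"gap hits 0 around {t:.2f}")  # ~2026.75 under this assumption
```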
But this isn’t a simple “more parameters = better” story. Both Kimi K2.6 and MiMo V2.5 Pro use a Mixture-of-Experts (MoE) architecture, reaching trillion-scale total parameter counts while activating only around 50B parameters per token. Because inference cost tracks that active slice rather than the total, costs drop drastically without sacrificing capability.
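A minimal sketch of the top-k routing idea behind that claim (the expert count, k, and dimensions are illustrative toys, not Kimi’s or MiMo’s actual configurations):

```python
import numpy as np

# Toy top-k MoE layer: the router scores every expert per token, but only
# the top-k experts execute, so compute tracks *active* parameters, not
# total parameters. All sizes here are illustrative.
rng = np.random.default_rng(0)
d, n_experts, k = 64, 32, 2

router_w = rng.normal(size=(d, n_experts))
experts  = [rng.normal(size=(d, d)) for _ in range(n_experts)]

def moe_layer(x: np.ndarray) -> np.ndarray:   # x: (d,) one token's hidden state
    logits = x @ router_w                     # score all 32 experts...
    top = np.argsort(logits)[-k:]             # ...but select only the top 2
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over top-k
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

y = moe_layer(rng.normal(size=d))
print(f"executed {k}/{n_experts} experts -> {k/n_experts:.0%} of expert params active")
```

The printed ~6% active fraction is the toy analogue of “trillion-scale total, ~50B active” (about 5%).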
The Overlooked Variable: Practical Gap
An evaluation report from CAISI (the US Center for AI Standards and Innovation) stated that DeepSeek V4 Pro’s overall capability “lags the frontier by about 8 months.” That judgment is partially reflected in the Intelligence Index: 52 points is indeed below 60.
But the “8-month gap” interpretation needs full context:
- GPT-5.5 iterates on the GPT-5.0 that shipped last August, and DeepSeek V4 Pro has already caught up to that earlier version
- In coding, Chinese language understanding, and long-text processing, domestic models perform in the same tier as international flagships
- Open weights plus local deployment are something closed-source models, by definition, cannot provide
One developer summed it up precisely:
“Parameters aren’t lacking, benchmark scores aren’t lacking — so where’s the gap? The biggest gap is real-world practice. But if your scenario doesn’t need the frontier’s 100% capability, then 92% capability at 1/10 the price is the better choice.”
Landscape Assessment
The Intelligence Index data is rewriting a fundamental assumption: that closed-source models’ capability advantage is permanent.
When open-source models come within 6 points of closed-source flagships while costing one-fifth to one-tenth the price, the logic of market competition shifts from “who’s the strongest” to “who’s the best fit.”
The cascading effects of this shift:
- Enterprise procurement: Moving from “buy the most expensive” to “allocate by scenario” — core reasoning with GPT-5.5, daily development with DeepSeek, long documents with Kimi
- Individual developers: Multi-model routing becomes a standard skill; knowing how to orchestrate several models matters more than mastering any single one (a minimal routing sketch follows this list)
- Model vendors: Closed-source vendors must prove that the “6-point gap” carries irreplaceable value in specific scenarios; otherwise, price stratification will translate directly into lost market share
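A minimal sketch of scenario-based routing, mirroring the split above (model names come from the table; the task labels, default model, and context threshold are assumptions for illustration):

```python
# Minimal scenario router mirroring the "allocate by scenario" split above.
# Task labels, the default model, and the context threshold are assumptions.
ROUTES = {
    "hard_reasoning": "gpt-5.5",        # pay frontier prices only where it matters
    "daily_coding":   "deepseek-v4-pro",
    "long_document":  "kimi-k2.6",
}
DEFAULT = "deepseek-v4-flash"           # cheap fallback for everything else

def pick_model(task_type: str, context_tokens: int = 0) -> str:
    if context_tokens > 128_000:        # very long inputs go to the long-context model
        return ROUTES["long_document"]
    return ROUTES.get(task_type, DEFAULT)

assert pick_model("daily_coding") == "deepseek-v4-pro"
assert pick_model("chitchat", context_tokens=200_000) == "kimi-k2.6"
```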
Action Items
- If you’re evaluating model migration: Test Kimi K2.6 or DeepSeek V4 Pro in 20% of your real business scenarios first — the 6-point Intelligence Index gap is likely imperceptible in daily use
- If you’re making model procurement decisions: Don’t just compare absolute Intelligence Index scores; calculate “cost per Intelligence point.” On output prices, Kimi K2.6 comes to about $0.055 per M tokens per point versus about $0.50 for GPT-5.5, roughly a 9x difference (see the calculation sketch after this list)
- If you’re building Agent applications: Open-source MoE models’ cost advantage is even more pronounced in agent scenarios, because agents typically consume tokens in bulk, magnifying any per-token price difference
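Putting numbers on the last two items (output list prices come from the table above; the monthly agent token volume is a hypothetical figure):

```python
# Cost per Intelligence point, using output list prices from the table above.
models = {  # name: (output $/M tokens, Intelligence Index)
    "gpt-5.5":   (30.00, 60),
    "kimi-k2.6": ( 3.00, 54),
}
for name, (price, score) in models.items():
    print(f"{name}: ${price / score:.3f} per M output tokens per index point")
# gpt-5.5: $0.500 vs kimi-k2.6: $0.056 -> roughly a 9x spread

# Agent workloads magnify the spread in absolute terms. The 500M output
# tokens/month figure below is an assumption, not a measured number.
monthly_m_tokens = 500
for name, (price, _) in models.items():
    print(f"{name}: ${price * monthly_m_tokens:,.0f}/month at {monthly_m_tokens}M tokens")
# $15,000 vs $1,500 per month: the same ratio, but now in absolute dollars
```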