Open Source Models Closing In on Closed Source: What a 6-Point Gap Means

The Signal

The latest Intelligence Index data reveals an underappreciated trend: the capability gap between Chinese open-source models and global closed-source flagships is rapidly converging.

| Model | Intelligence Index | Open Source | Price Positioning |
|---|---|---|---|
| GPT-5.5 | 60 | Closed | $5/$30 per M |
| Gemini 3 / Claude | 57 | Closed | $3.50/$15 per M |
| Kimi K2.6 | 54 | Open | ~$1.70/$3 per M |
| MiMo V2.5 Pro | 54 | Open | MIT License |
| DeepSeek V4 Pro | 52 | Open | $2.20/$3.48 per M |
| GLM-5.1 | ~50 | Open | Subscription |
| MiniMax M2.7 | ~49 | Open | Low-cost |

The gap between GPT-5.5 and Kimi K2.6 is only 6 points. Given that Kimi K2.6's API costs roughly one-tenth of GPT-5.5's, the cost-performance curve is steep enough to change most enterprises' model-selection decisions.

The Practical Meaning of a 6-Point Gap

The Intelligence Index was designed to comprehensively evaluate model capabilities in real-world scenarios — not memorized benchmark scores, but a weighted score across reasoning, coding, instruction following, long context, and more.

What does a 6-point gap mean?

In 80% of daily development scenarios, users cannot tell the difference.

A developer sharing their “budget AI package” on VEX put it plainly:

“I use DeepSeek V4 Flash for coding — the free tier is enough for daily use. When I need reasoning power, I switch to Pro, pay-per-use, and it costs just a few bucks a month.”

This isn’t theoretical “good enough” — it’s a choice in real production environments. When Kimi K2.6 beat Claude Opus 4.7 on LiveBench (a dynamic anti-cheating evaluation), the narrative of closed-source models’ “capability moat” began to crumble.

The Catch-Up Path of Open Source Models

Looking at the Intelligence Index trajectory:

  • 2025 Q2: GPT-5.0 (50) vs DeepSeek V3 (38) → 12-point gap
  • 2025 Q4: GPT-5.2 (55) vs DeepSeek V4 (45) → 10-point gap
  • 2026 Q1: GPT-5.5 (60) vs Kimi K2.6 (54) → 6-point gap

The catch-up pace is accelerating: 2 points closed over the half year to 2025 Q4, then 4 more in the single quarter that followed. If the gap keeps shrinking by 2-4 points per release cycle, open-source models could reach the current GPT-5.5 level by the end of 2026.
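Under the naive assumption that each future release cycle closes another 2-4 points, the remaining 6-point gap can be projected. This is a toy extrapolation; `project_gap` and the cycle counts are our illustration, not anything from the Index itself:

```python
# Toy extrapolation of the Intelligence Index gap. The per-cycle shrink
# rates (2 and 4 points) come from the trajectory above; the number of
# remaining cycles is an illustrative assumption, not a forecast.
def project_gap(current_gap: int, shrink_per_cycle: int, cycles: int) -> int:
    """Gap after `cycles` more release cycles, floored at zero."""
    return max(0, current_gap - shrink_per_cycle * cycles)

conservative = project_gap(6, 2, 2)  # two slow cycles: a 2-point gap remains
optimistic = project_gap(6, 4, 2)    # two fast cycles: parity reached
print(conservative, optimistic)      # → 2 0
```

Even the conservative branch lands within the noise floor of most day-to-day use, which is the article's core claim.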

But this isn’t a simple “more parameters = better” story. Both Kimi K2.6 and MiMo V2.5 Pro use MoE (Mixture of Experts) architecture, achieving trillion-level total parameters while keeping active parameters around 50B. This means inference costs can be drastically reduced without sacrificing capability.

The Overlooked Variable: Practical Gap

The US CAISI agency’s evaluation report stated that DeepSeek V4 Pro’s comprehensive capability “lags the frontier by about 8 months.” This judgment is partially reflected in the Intelligence Index — 52 points is indeed below 60.

But the “8-month gap” interpretation needs full context:

  • GPT-5.5 is an iteration of GPT-5.0 released last August, and DeepSeek V4 Pro’s capability has already caught up to that version
  • In coding, Chinese language understanding, and long-text processing, domestic models perform in the same tier as international flagships
  • Open weights + local deployment capability is something closed-source models can never provide

One developer’s summary was precise:

“Parameters aren’t lacking, benchmark scores aren’t lacking — so where’s the gap? The biggest gap is real-world practice. But if your scenario doesn’t need the frontier’s 100% capability, then 92% capability at 1/10 the price is the better choice.”

Landscape Assessment

The Intelligence Index data is rewriting a fundamental assumption: that closed-source models’ capability advantage is permanent.

When open-source models approach closed-source flagships within 6 points while costing 1/5 to 1/10 the price, market competition logic shifts from “who’s the strongest” to “who’s the best fit.”

The cascading effects of this shift:

  1. Enterprise procurement: Moving from “buy the most expensive” to “allocate by scenario” — core reasoning with GPT-5.5, daily development with DeepSeek, long documents with Kimi
  2. Individual developers: Multi-model routing becomes a standard skill — knowing how to orchestrate models matters more than mastering a single one
  3. Model vendors: Closed-source vendors must prove that the “6-point gap” has irreplaceable value in specific scenarios, otherwise price stratification will directly translate into market share loss
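The "allocate by scenario" and "multi-model routing" ideas above reduce to a dispatch function. A minimal sketch, using the article's model lineup but with routing rules and thresholds that are purely illustrative:

```python
# Minimal multi-model router. Model names come from the article; the
# task categories and the 100k-token threshold are assumptions.
def route(task_type: str, context_tokens: int) -> str:
    if context_tokens > 100_000:
        return "kimi-k2.6"        # long documents
    if task_type == "core-reasoning":
        return "gpt-5.5"          # pay frontier prices only where they matter
    return "deepseek-v4-pro"      # daily-development default

print(route("coding", 2_000))         # → deepseek-v4-pro
print(route("core-reasoning", 500))   # → gpt-5.5
print(route("summarize", 250_000))    # → kimi-k2.6
```

In practice the routing signal might be a classifier or a cost budget rather than a hand-written rule, but the structure (cheap default, expensive escalation path) stays the same.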

Action Items

  • If you’re evaluating model migration: Test Kimi K2.6 or DeepSeek V4 Pro in 20% of your real business scenarios first — the 6-point Intelligence Index gap is likely imperceptible in daily use
  • If you’re making model procurement decisions: Don’t just look at absolute Intelligence Index scores; calculate “cost per Intelligence point.” On output pricing, Kimi K2.6 comes to about $0.055 per point versus about $0.50 for GPT-5.5, a 9x difference
  • If you’re building Agent applications: Open-source MoE models have even more pronounced cost advantages in Agent scenarios, because Agents typically require massive token consumption, magnifying the per-unit cost impact
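The "cost per Intelligence point" figures in the checklist can be reproduced directly from the table's output prices and Index scores (our reading of the metric: output price per M tokens divided by Index score):

```python
# Reproducing the cost-per-point comparison from the table:
# Kimi K2.6 at $3/M output with a 54 Index score, GPT-5.5 at $30/M with 60.
def cost_per_point(output_price_per_m: float, index_score: int) -> float:
    """Dollars per M output tokens, per Intelligence Index point."""
    return output_price_per_m / index_score

kimi = cost_per_point(3.00, 54)   # ≈ $0.056 per point
gpt = cost_per_point(30.00, 60)   # = $0.50 per point
print(round(gpt / kimi, 1))       # → 9.0
```

Swap in your own blended input/output price if your workload is input-heavy; the ratio shifts, but with the table's prices it stays well above 5x.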