ERNIE 5.1 Preview Breaks into LMArena Global Top 15: The Sole Chinese Model to Break Through

Key Conclusion

The latest LMArena (formerly LMSYS Chatbot Arena) text leaderboard, dated April 30, shows Baidu’s ERNIE 5.1 Preview scoring 1476 points, ranking first among Chinese models and entering the global top 15. It is currently the only Chinese model in the global Top 15, and it ranks above GPT-5.5 and DeepSeek-V4-Pro.

Meanwhile, Zhipu GLM-5.1 and Kimi K2.6 have entered the “passed entry tier” in coding agent scenarios, forming a three-way competitive landscape with ERNIE 5.1 among Chinese models.

LMArena Text Leaderboard: Latest Landscape

Rank	Model	Score	Vendor	Notes
1-5	Frontier models	1500+	OpenAI, etc.	Global leaders
~10	ERNIE 5.1 Preview	1476	Baidu	Only Chinese model in Top 15
—	GPT-5.5	<1476	OpenAI	Surpassed by ERNIE 5.1
—	DeepSeek-V4-Pro	<1476	DeepSeek	Surpassed by ERNIE 5.1

ERNIE 5.1’s key breakthrough is in pure text conversation quality—the hardest metric to “game” in LMArena’s crowdsourced blind evaluation system, where real users vote on anonymous model responses.
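To give a feel for what a gap between two leaderboard scores means in head-to-head votes, here is a minimal sketch. It assumes the scores behave like conventional Elo-style ratings (logistic curve, base 10, 400-point scale); that scale is an assumption for illustration and is not stated in this article. The function name `expected_win_prob` is hypothetical.

```python
def expected_win_prob(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Chance that model A's answer is preferred over model B's.

    Assumes an Elo-style logistic model (base 10, 400-point scale).
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


ernie_score = 1476      # score reported in the article
frontier_floor = 1500   # lower edge of the "1500+" band in the table

# Probability that ERNIE 5.1 Preview wins a single blind vote
# against a model sitting exactly at 1500, under the assumption above.
p = expected_win_prob(ernie_score, frontier_floor)
print(f"{p:.3f}")
```

Under this assumption, a 24-point gap corresponds to ERNIE 5.1 Preview winning slightly fewer than half of the blind votes against a 1500-point model, which is why small score differences near the top of the board translate into close matchups rather than decisive ones.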

Chinese Model “Big Three” Positioning

From developer feedback, Chinese models have formed a clear division of labor:

First Tier (Passed Entry):

GLM-5.1 (Zhipu) — Strongest in coding agent scenarios, but it has produced garbled or repetitive output under high concurrency combined with long contexts (70K+ tokens); Zhipu has published a post-mortem
Kimi K2.6 (Moonshot AI) — Tied with GLM-5.1, strong agent capabilities
ERNIE 5.1 Preview (Baidu) — Strongest in text conversation quality, backed by LMArena data

Second Tier (Not Yet Passed Entry):

DeepSeek-V4-Pro, Qwen 3.6 Plus, Tencent Hunyuan HY-3, etc.

This stratification shows that the question for Chinese models is no longer “which is better” but “which model for which scenario,” closely mirroring how the smartphone market evolved from 2012 to 2016.

Why This Ranking Matters

LMArena’s credibility: Unlike vendor-reported benchmarks, LMArena uses real user blind evaluations that are hard to manipulate
Text vs. Multimodal: Amid 2026’s multimodal and agent hype, ERNIE 5.1 shows that pure text conversation quality remains an independent competitive dimension
Baidu’s AI inflection point: The ERNIE series has long been seen as “large but not refined”; the 5.1 Preview performance suggests Baidu has found a breakthrough in foundational text models

Action Recommendations

Chinese long-text tasks: ERNIE 5.1 Preview is worth priority testing, especially for conversation quality and Chinese comprehension
Coding Agent scenarios: GLM-5.1 and Kimi K2.6 remain more mature choices, but watch Zhipu’s high-concurrency bug fixes
Cost-sensitive scenarios: DeepSeek-V4-Pro and Qwen 3.6 Plus still offer strong cost-performance advantages
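The recommendations above amount to a scenario-to-model lookup. A minimal sketch follows, assuming a team wants to encode that shortlist in code; the scenario keys and the `candidates_for` helper are hypothetical names for illustration, and the model names are taken directly from the list above.

```python
# Shortlist of models to evaluate first, keyed by scenario.
# Scenario keys are hypothetical labels; model names come from the article.
RECOMMENDATIONS: dict[str, list[str]] = {
    "chinese_long_text": ["ERNIE 5.1 Preview"],
    "coding_agent": ["GLM-5.1", "Kimi K2.6"],
    "cost_sensitive": ["DeepSeek-V4-Pro", "Qwen 3.6 Plus"],
}


def candidates_for(scenario: str) -> list[str]:
    """Return the models worth testing first for a given scenario."""
    if scenario not in RECOMMENDATIONS:
        known = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"Unknown scenario {scenario!r}; expected one of: {known}")
    # Return a copy so callers cannot mutate the shared table.
    return list(RECOMMENDATIONS[scenario])


print(candidates_for("coding_agent"))
```

This only captures the article's starting shortlist; actual selection would still depend on testing each candidate against your own workload.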

LMArena rankings will continue to update. Whether ERNIE 5.1 can hold this position in its full release remains to be seen, but as the first Top 15 breakthrough for Chinese models on an authoritative global leaderboard, the signal is clear enough.
