Bottom Line
As of late April 2026, global AI model leaderboards show a "two-leaderboard, two-champions" pattern: Anthropic dominates the LMArena (formerly Chatbot Arena) Elo rankings, while OpenAI's GPT-5.5 series leads the Artificial Analysis Intelligence Index. The two leaderboards measure different dimensions of capability: crowd-sourced user preference on one side, standardized benchmark performance on the other.
LMArena Elo: User Preference Rankings
Rankings are derived from anonymous, head-to-head A/B votes by users; data as of April 24 (a simplified Elo-update sketch follows the table):
| Rank | Model | Elo | Vendor |
|---|---|---|---|
| 1 | Claude Opus 4.7 (thinking) | 1503 | Anthropic |
| 2 | Claude Opus 4.6 (thinking) | 1503 | Anthropic |
| 3 | Claude Opus 4.6 | 1496 | Anthropic |
| 4 | Claude Opus 4.7 | 1494 | Anthropic |
| 5 | Gemini 3.1 Pro Preview | 1493 | Google DeepMind |
| 6 | Muse Spark | 1492 | Meta AI |
| 7 | Gemini 3.0 Pro | 1486 | Google DeepMind |
| 8 | grok-4.20-beta1 | 1482 | xAI |
| 9 | gpt-5.4-high | 1481 | OpenAI |
| 10 | grok-4.20-beta-reasoning | 1479 | xAI |
Key signals: Anthropic sweeps the top 4. Meta's Muse Spark enters the top 10 for the first time since early 2025.
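For context on how pairwise votes turn into the Elo numbers above, the sketch below shows a textbook online Elo update. It is an illustration only: LMArena's published methodology fits a statistical model (Bradley-Terry style) over the full vote history rather than updating ratings vote by vote, and the K-factor here is an arbitrary assumption.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Win probability of A over B implied by an Elo gap (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a: float, r_b: float, outcome: float, k: float = 4.0) -> tuple[float, float]:
    """Apply one pairwise vote.

    outcome: 1.0 if A is preferred, 0.0 if B is preferred, 0.5 for a tie.
    k: step size (hypothetical value chosen only for this illustration).
    """
    e_a = expected_score(r_a, r_b)
    delta = k * (outcome - e_a)
    return r_a + delta, r_b - delta


# Example: a 1503-rated model wins one vote against a 1493-rated model.
print(elo_update(1503.0, 1493.0, 1.0))  # ratings shift only slightly per vote
```

With thousands of votes per model pair, these small per-vote shifts converge to the stable gaps of a few points seen in the table.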
AA Intelligence Index: Standardized Benchmark Rankings
The index aggregates 10 standardized benchmarks spanning coding, math, science, reasoning, and agentic tasks; data as of April 25 (an illustrative aggregation sketch follows the table):
| Rank | Model | Score | Vendor |
|---|---|---|---|
| 1 | GPT-5.5 (xhigh) | 60 | OpenAI |
| 2 | GPT-5.5 (high) | 59 | OpenAI |
| 3 | Claude Opus 4.7 (max) | 57 | Anthropic |
| 4 | Gemini 3.1 Pro Preview | 57 | Google DeepMind |
| 5 | GPT-5.4 (xhigh) | 57 | OpenAI |
| 6 | GPT-5.5 (medium) | 57 | OpenAI |
| 7 | Kimi K2.6 | 54 | Moonshot AI |
| 8 | MiMo-V2.5-Pro | 54 | Xiaomi |
| 9 | GPT-5.3 Codex (xhigh) | 54 | OpenAI |
| 10 | Muse Spark | 52 | Meta AI |
Key signals: GPT-5.5 sweeps the top 2, and OpenAI holds 4 of the top 6. Two Chinese models appear in the top 10: Kimi K2.6 (Moonshot AI) and MiMo-V2.5-Pro (Xiaomi), with Kimi K2.6 ranked highest.
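The index score is an aggregate of per-benchmark results normalized to a common scale. The sketch below shows a simple equal-weight aggregation; the actual Artificial Analysis benchmark list and weighting are not reproduced here, and the category names and numbers are hypothetical placeholders.

```python
# Hypothetical per-benchmark scores (0-100 scale) for one model; not real data.
benchmark_scores = {
    "coding": 62.0,
    "math": 71.0,
    "science": 55.0,
    "reasoning": 58.0,
    "agents": 49.0,
}


def intelligence_index(scores: dict[str, float]) -> float:
    """Equal-weight mean of normalized benchmark scores.

    Equal weighting is an assumption for illustration; the published
    index may weight or combine benchmarks differently.
    """
    return sum(scores.values()) / len(scores)


print(round(intelligence_index(benchmark_scores)))  # -> 59
```

Because the index compresses many benchmarks into one number, models one point apart (e.g., 57 vs. 54) can still differ sharply on individual categories such as coding or agentic tasks.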
Selection Guide
- Strongest overall benchmarks: GPT-5.5 (xhigh), AA score 60
- Best user experience: Claude Opus 4.7 (thinking), LMArena 1503 Elo
- Cost-effective: GPT-5.5 (medium), AA 57 at lower price
- Chinese models: Kimi K2.6 at 54, the highest-ranked Chinese entry (Xiaomi's MiMo-V2.5-Pro ties on score one rank below)
- Open-source / semi-open: Muse Spark (Meta) at 52