Bottom Line
As of late April 2026, global AI model leaderboards show a "two-leaderboard, two-champions" pattern: Anthropic dominates the LMArena (formerly Chatbot Arena) Elo rankings, while OpenAI's GPT-5.5 series leads the Artificial Analysis Intelligence Index. The two leaderboards measure different dimensions of capability: crowd-sourced user preference on one side, standardized benchmark performance on the other.
LMArena Elo: User Preference Rankings
Rankings are derived from anonymous, head-to-head A/B votes by users; data as of April 24 (a simplified Elo-update sketch follows the table):
| Rank | Model | Elo | Vendor |
|---|---|---|---|
| 1 | Claude Opus 4.7 (thinking) | 1503 | Anthropic |
| 2 | Claude Opus 4.6 (thinking) | 1503 | Anthropic |
| 3 | Claude Opus 4.6 | 1496 | Anthropic |
| 4 | Claude Opus 4.7 | 1494 | Anthropic |
| 5 | Gemini 3.1 Pro Preview | 1493 | Google DeepMind |
| 6 | Muse Spark | 1492 | Meta AI |
| 7 | Gemini 3.0 Pro | 1486 | Google DeepMind |
| 8 | grok-4.20-beta1 | 1482 | xAI |
| 9 | gpt-5.4-high | 1481 | OpenAI |
| 10 | grok-4.20-beta-reasoning | 1479 | xAI |
Key signals: Anthropic sweeps the top 4. Meta's Muse Spark enters the top 10 for the first time since early 2025.
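For context on how pairwise votes turn into the Elo numbers above, the sketch below shows a textbook online Elo update. It is an illustration only: LMArena's published methodology fits a statistical model (Bradley-Terry style) over the full vote history rather than updating ratings vote by vote, and the K-factor here is an arbitrary assumption.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Win probability of A over B implied by an Elo gap (400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a: float, r_b: float, outcome: float, k: float = 4.0) -> tuple[float, float]:
    """Apply one pairwise vote.

    outcome: 1.0 if A is preferred, 0.0 if B is preferred, 0.5 for a tie.
    k: step size (hypothetical value chosen only for this illustration).
    """
    e_a = expected_score(r_a, r_b)
    delta = k * (outcome - e_a)
    return r_a + delta, r_b - delta


# Example: a 1503-rated model wins one vote against a 1493-rated model.
print(elo_update(1503.0, 1493.0, 1.0))  # ratings shift only slightly per vote
```

With thousands of votes per model pair, these small per-vote shifts converge to the stable gaps of a few points seen in the table.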
AA Intelligence Index: Standardized Benchmark Rankings
The index aggregates 10 standardized benchmarks spanning coding, math, science, reasoning, and agentic tasks; data as of April 25 (an illustrative aggregation sketch follows the table):
| Rank | Model | Score | Vendor |
|---|---|---|---|
| 1 | GPT-5.5 (xhigh) | 60 | OpenAI |
| 2 | GPT-5.5 (high) | 59 | OpenAI |
| 3 | Claude Opus 4.7 (max) | 57 | Anthropic |
| 4 | Gemini 3.1 Pro Preview | 57 | Google DeepMind |
| 5 | GPT-5.4 (xhigh) | 57 | OpenAI |
| 6 | GPT-5.5 (medium) | 57 | OpenAI |
| 7 | Kimi K2.6 | 54 | Moonshot AI |
| 8 | MiMo-V2.5-Pro | 54 | Xiaomi |
| 9 | GPT-5.3 Codex (xhigh) | 54 | OpenAI |
| 10 | Muse Spark | 52 | Meta AI |
Key signals: GPT-5.5 sweeps the top 2, and OpenAI holds 4 of the top 6. Two Chinese models appear in the top 10: Kimi K2.6 (Moonshot AI) and MiMo-V2.5-Pro (Xiaomi), with Kimi K2.6 ranked highest.
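The index score is an aggregate of per-benchmark results normalized to a common scale. The sketch below shows a simple equal-weight aggregation; the actual Artificial Analysis benchmark list and weighting are not reproduced here, and the category names and numbers are hypothetical placeholders.

```python
# Hypothetical per-benchmark scores (0-100 scale) for one model; not real data.
benchmark_scores = {
    "coding": 62.0,
    "math": 71.0,
    "science": 55.0,
    "reasoning": 58.0,
    "agents": 49.0,
}


def intelligence_index(scores: dict[str, float]) -> float:
    """Equal-weight mean of normalized benchmark scores.

    Equal weighting is an assumption for illustration; the published
    index may weight or combine benchmarks differently.
    """
    return sum(scores.values()) / len(scores)


print(round(intelligence_index(benchmark_scores)))  # -> 59
```

Because the index compresses many benchmarks into one number, models one point apart (e.g., 57 vs. 54) can still differ sharply on individual categories such as coding or agentic tasks.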
Selection Guide
- Strongest overall benchmarks: GPT-5.5 (xhigh), AA score 60
- Best user experience: Claude Opus 4.7 (thinking), LMArena 1503 Elo
- Cost-effective: GPT-5.5 (medium), AA 57 at lower price
- Chinese models: Kimi K2.6 at 54, the highest-ranked Chinese entry (Xiaomi's MiMo-V2.5-Pro ties on score one rank below)
- Open-source / semi-open: Muse Spark (Meta) at 52