Core Conclusion
In April 2026, open-source models achieved a historic breakthrough: Moonshot AI’s Kimi K2.6 surpassed Claude Opus 4.7 on LiveBench, becoming the best-performing open-source model on this benchmark.
LiveBench is known for continuously refreshing its test set, regularly swapping questions out so models cannot inflate scores by memorizing training data. Beating Opus 4.7 on a benchmark designed to resist contamination suggests Kimi K2.6's generalization ability has reached closed-source flagship levels.
Data Comparison
| Dimension | Kimi K2.6 | Claude Opus 4.7 | Gap |
|---|---|---|---|
| LiveBench | Higher score | Baseline | K2.6 leads |
| SWE-Bench | ~80% | 87.6% | ~7.6pp behind |
| Input price ($/1M tokens) | $0.80-0.95 | $5.00 | K2.6 is 5-6x cheaper |
| Output price ($/1M tokens) | $3.60-4.00 | $25.00 | K2.6 is 6-7x cheaper |
| License | Open source | Closed source | — |
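To put the price gap in monthly-bill terms, here is a minimal cost sketch using the per-token prices from the table above. The traffic volumes (200M input / 50M output tokens per month) are illustrative assumptions, not measurements.

```python
# Rough monthly cost comparison based on the prices in the table above.
# Token volumes are illustrative assumptions; substitute your own traffic.

PRICES = {  # $ per 1M tokens: (input, output); upper end of K2.6's range
    "Kimi K2.6": (0.95, 4.00),
    "Claude Opus 4.7": (5.00, 25.00),
}

INPUT_M_TOKENS = 200   # assumed: 200M input tokens/month
OUTPUT_M_TOKENS = 50   # assumed: 50M output tokens/month

for model, (p_in, p_out) in PRICES.items():
    cost = INPUT_M_TOKENS * p_in + OUTPUT_M_TOKENS * p_out
    print(f"{model}: ${cost:,.2f}/month")

# Kimi K2.6: $390.00/month
# Claude Opus 4.7: $2,250.00/month (~5.8x more)
```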
Why It Matters
- Pricing power shift: When open models match closed models on key benchmarks, closed vendors’ pricing power is significantly weakened
- Ecosystem prosperity: Open weights mean anyone can build specialized variants
- Local deployment: For data-sensitive enterprises, Kimi K2.6 provides near-flagship local inference
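For teams evaluating self-hosting, below is a minimal serving sketch with vLLM. The repo id `moonshotai/Kimi-K2.6` is a hypothetical placeholder for the actual open-weights release, and `tensor_parallel_size` must match your hardware; a model of this class realistically needs a multi-GPU node.

```python
# Minimal local-inference sketch with vLLM (offline batch mode).
# "moonshotai/Kimi-K2.6" is a hypothetical repo id; substitute the
# actual open-weights release. tensor_parallel_size depends on your GPUs.
from vllm import LLM, SamplingParams

llm = LLM(model="moonshotai/Kimi-K2.6", tensor_parallel_size=8)
params = SamplingParams(temperature=0.6, max_tokens=256)

prompts = ["Summarize our internal data-retention policy in one paragraph."]
outputs = llm.generate(prompts, params)
print(outputs[0].outputs[0].text)
```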
Action Recommendations
- Budget-conscious teams: Kimi K2.6 is currently the most cost-effective flagship-level open model
- Coding scenarios: If SWE-Bench is your core metric, Claude Opus 4.7 (87.6%) still leads, but the gap is narrowing
- Multimodal scenarios: K2.6’s native multimodal capability makes it a cleaner alternative to bolted-together “LLM + visual encoder” pipelines (see the API sketch after this list)
- Moonshot top-up window: Current bonus promotion ends May 3 ($100-$299 gets 20% bonus, $1,000+ gets 30% bonus)
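For the multimodal point above, here is a minimal request sketch. It assumes Moonshot's OpenAI-compatible endpoint; `kimi-k2.6` is a hypothetical model id, so confirm the real one against Moonshot's model list.

```python
# Minimal multimodal call via an OpenAI-compatible client.
# The base URL follows Moonshot's OpenAI-compatible API; the model id
# "kimi-k2.6" is a hypothetical placeholder; check the official list.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_MOONSHOT_API_KEY",
    base_url="https://api.moonshot.cn/v1",
)

resp = client.chat.completions.create(
    model="kimi-k2.6",  # hypothetical id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What trend does this chart show?"},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/chart.png"}},
        ],
    }],
)
print(resp.choices[0].message.content)
```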