Kimi K2.6 Beats Opus 4.7 on LiveBench: The Era of Open Models Challenging Closed-Source Flagships

Core Conclusion

In April 2026, open-source models achieved a historic breakthrough: Moonshot AI’s Kimi K2.6 surpassed Claude Opus 4.7 on LiveBench, becoming the best-performing open-source model on this benchmark.

LiveBench is known for continuously refreshing its test questions, regularly swapping them out so that models cannot inflate their scores by memorizing training data. Beating Opus 4.7 on this contamination-resistant, continuously updated evaluation suggests Kimi K2.6's generalization ability has reached closed-source flagship levels.

Data Comparison

| Dimension | Kimi K2.6 | Claude Opus 4.7 | Gap |
|---|---|---|---|
| LiveBench | Wins | Baseline | K2.6 leads |
| SWE-Bench | ~80% | 87.6% | ~7.6pp behind |
| Input price ($/1M tokens) | $0.80-0.95 | $5.00 | K2.6 is 5-6x cheaper |
| Output price ($/1M tokens) | $3.60-4.00 | $25.00 | K2.6 is 6-7x cheaper |
| License | Open source | Closed source | |
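The cost gap in the table can be made concrete with a quick back-of-envelope calculation. The workload size below is illustrative, and the K2.6 rates are taken at the midpoints of the quoted price ranges; actual billing may differ by provider and tier:

```python
def cost_usd(input_tokens, output_tokens, in_price, out_price):
    """Total cost in USD, given per-1M-token input/output rates."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# Illustrative monthly workload: 10M input tokens, 2M output tokens
workload = (10_000_000, 2_000_000)

# K2.6 at midpoints of $0.80-0.95 (input) and $3.60-4.00 (output)
k26 = cost_usd(*workload, in_price=0.875, out_price=3.80)
opus = cost_usd(*workload, in_price=5.00, out_price=25.00)

print(f"Kimi K2.6:       ${k26:,.2f}")   # $16.35
print(f"Claude Opus 4.7: ${opus:,.2f}")  # $100.00
print(f"Ratio: {opus / k26:.1f}x")       # 6.1x
```

The ratio lands around 6x for this input/output mix; output-heavy workloads trend toward the higher end of the 6-7x range, input-heavy ones toward 5-6x.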

Why It Matters

  1. Pricing power shift: When open models match closed models on key benchmarks, closed vendors’ pricing power is significantly weakened
  2. Ecosystem prosperity: Open weights mean anyone can build specialized variants
  3. Local deployment: For data-sensitive enterprises, Kimi K2.6 provides near-flagship local inference

Action Recommendations

  • Budget-conscious teams: Kimi K2.6 is currently the most cost-effective flagship-level open model
  • Coding scenarios: If SWE-Bench is your core metric, Claude Opus 4.7 (87.6%) still leads, but the gap is narrowing
  • Multimodal scenarios: K2.6’s native multimodal capability makes it a cleaner alternative to “LLM + visual encoder” setups
  • Moonshot top-up window: Current bonus promotion ends May 3 ($100-$299 gets 20% bonus, $1,000+ gets 30% bonus)
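The bonus tiers above work out as follows; note the source does not state what the $300-$999 range receives, so this sketch conservatively assumes no bonus there (check Moonshot's pricing page for the actual terms):

```python
def topup_credit(amount_usd: float) -> float:
    """Total account credit for a top-up, per the quoted promo tiers."""
    if amount_usd >= 1000:
        return amount_usd * 1.30   # 30% bonus for $1,000+
    if 100 <= amount_usd <= 299:
        return amount_usd * 1.20   # 20% bonus for $100-$299
    return amount_usd              # no bonus stated for other amounts

print(topup_credit(200))     # 240.0
print(topup_credit(1000))    # 1300.0
```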