Qwen 3.6 Tops AI Intelligence Index: How a 27B Open Model Takes on Closed-Source Giants

The Data

On April 30, 2026, Artificial Analysis updated its Intelligence Index rankings. Qwen 3.6 27B scored 46, making it the outright leader among open models under 150B parameters. The same day, the Vals Index placed it in the top 8 overall, first among open models and behind only a few ultra-large ones.

| Model | Intelligence Index | Parameters | Open | Cost per 1M Output Tokens |
|---|---|---|---|---|
| Qwen 3.6 27B | 46 | 27B dense | ✅ Apache 2.0 | ~$0.20 (local) |
| Gemma 4 31B | 39 | 31B dense | ✅ | ~$0.01 (local) |
| Llama 4 Scout | 42 | ~17B (MoE) | ✅ | ~$0.15 (local) |
| Claude Opus 4.7 | ~48 | closed | ❌ | $25.00 |
| GPT-5.5 | ~47 | closed | ❌ | $10.00 |

Key finding: Qwen 3.6 27B’s Intelligence Index score is already very close to GPT-5.5 and Claude Opus 4.7, but the cost difference is orders of magnitude.

Why 27B?

What does 27 billion parameters mean? A MacBook Pro M4 with 24GB RAM (~$2,500) can run it at 4-bit quantization. This is not a lab model requiring GPU clusters—it’s a “compact powerhouse” every developer can run on their desk.
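The hardware claim checks out with back-of-envelope arithmetic. A minimal sketch (the 4 GB overhead figure for KV cache and runtime is an assumption, not from the report):

```python
# Rough memory estimate for a 27B-parameter model at 4-bit quantization.
params = 27e9            # 27 billion parameters
bytes_per_param = 0.5    # 4-bit weights = half a byte each
weights_gb = params * bytes_per_param / 1e9

overhead_gb = 4.0        # assumed KV cache + runtime overhead (hypothetical)
total_gb = weights_gb + overhead_gb

print(weights_gb)  # 13.5 GB of weights
print(total_gb)    # 17.5 GB total, comfortably under 24 GB of RAM
```

At 8-bit the weights alone would be 27 GB and no longer fit, which is why 4-bit quantization is the enabling factor here.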

Community testing shows Qwen 3.6 27B has reached Opus 4.5-level performance on agentic tasks. One developer who ran it for a full day on an M4 MacBook Pro concluded: “This was science fiction 18 months ago.”

What’s the Trade-off?

Artificial Analysis also revealed a critical data point: Qwen 3.6 27B consumes approximately 3.7x more output tokens than Gemma 4 31B to complete the full Intelligence Index test suite, making the run about 21x more expensive.
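The two reported multipliers can be related with simple arithmetic: total cost scales as tokens times per-token price, so the implied effective per-token price gap is the cost ratio divided by the token ratio. A minimal check (3.7x and 21x are the reported figures; the ~5.7x is derived from them):

```python
# Sanity-check the reported multipliers: total cost = tokens * price_per_token.
token_ratio = 3.7   # Qwen 3.6 27B vs Gemma 4 31B output tokens (reported)
cost_ratio = 21.0   # total benchmark cost ratio (reported)

# Implied effective per-token price ratio between the two setups.
price_ratio = cost_ratio / token_ratio
print(round(price_ratio, 1))  # 5.7
```

In other words, most of the 21x gap comes from per-token pricing, with the extra verbosity multiplying it further.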

This is not a flaw—it’s a design trade-off. A dense model activates all 27B parameters on every inference pass, unlike MoE models such as Llama 4 Scout, which route each token through only a subset of experts. That buys output consistency and predictability—exactly what agentic workflows need. The price is higher token consumption.

Landscape Assessment

The competitive logic of open models is shifting from a “parameter race” to an “efficiency race.” Qwen 3.6 27B’s strategy is clear:

  1. Not the biggest, but the most practical: 27B sits at the upper limit of what consumer hardware can run
  2. Not the cheapest, but the most reliable: Dense architecture guarantees stability for agentic tasks
  3. Not all-purpose, but specialized: Targets closed-source flagships in coding and agentic scenarios

Action Recommendations

  • Local developers: If your work centers on coding assistance and agentic tasks, Qwen 3.6 27B is currently the best value on consumer-grade hardware
  • Enterprise deployment: Apache 2.0 license + no-cloud inference = the top choice for data compliance scenarios
  • Wait-and-see: Gemma 4 31B’s agentic capabilities are still improving—it’s already at 39 on the Intelligence Index at 1/21 the cost of Qwen