Verdict
Qwen 3.6-27B is one of the strongest open-source coding models under 30B parameters: 27 billion dense parameters (not MoE), Apache 2.0 license, and it runs in about 18GB of RAM — a MacBook Pro or a single consumer GPU is enough. It ties Claude 4.5 Opus on Terminal-Bench and lands near 50% on SWE-bench.
Best for local coding assistance and offline inference; not for workloads that need multimodal input or million-token context — this is a coding specialist.
Test Dimensions
Coding
Qwen 3.6-27B’s core selling point is coding. Released April 20, 2026, it topped six coding benchmarks at launch, including:
- SWE-bench Pro: ~50% (vs GPT-5.5’s 58.6%)
- Terminal-Bench 2.0: Tied with Claude 4.5 Opus
- Skills Bench: First place
That a 27B dense model performs at this level suggests its training data quality and compute efficiency far exceed earlier models of similar scale.
Deployment Cost
- Memory: ~18GB quantized (roughly 4–5-bit weights plus KV cache), comfortable on an RTX 4090 (24GB); full FP16 weights alone would need ~54GB (27B × 2 bytes)
- Quantized further: ~14GB of weights at INT4, runs on an M2/M3 MacBook
- Speed: 50+ tokens/s on a single RTX 4090; with no network round-trip, end-to-end latency often beats cloud APIs
- Cost: zero per-token API fees (hardware and electricity only)
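As a sanity check on the figures above, here is a minimal weights-only estimate. It counts parameter memory alone; KV cache, activations, and runtime overhead add several GB on top, which is why quantized real-world usage sits above the raw weight size.

```python
# Back-of-the-envelope weight memory for a dense model:
# parameter count × bits per parameter, converted to GB.
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    return params_billion * 1e9 * bits_per_param / 8 / 1e9

for label, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{label}: {weight_memory_gb(27, bits):.1f} GB")
# FP16: 54.0 GB, INT8: 27.0 GB, INT4: 13.5 GB (weights only)
```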
Limitations
- Context: Long context supported but not as reliable as GPT-5.5 at million-token scale
- Multimodal: Text-only, no image understanding
- Agentic capability: Less stable than frontier models in complex multi-step workflows
Comparison
| Model | Params | Arch | SWE-bench | Memory (quantized) | License |
|---|---|---|---|---|---|
| Qwen 3.6-27B | 27B | Dense | ~50% | 18GB | Apache 2.0 |
| Llama 3.1 70B | 70B | Dense | ~40% | 40GB | Llama |
| DeepSeek V4 | 1.6T | MoE | ~58% | Multi-GPU | Apache 2.0 |
Recommendations
Individual developers: If you have a 24GB GPU, Qwen 3.6-27B is the best local coding model to deploy.
Code assistant integration: use it as a local backend for VS Code or Cursor.
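A sketch of what that integration involves: most local servers (llama.cpp's llama-server, vLLM, Ollama) expose an OpenAI-compatible /v1/chat/completions endpoint, so the editor only needs a base URL and a standard request body. The endpoint URL and model id below are assumptions, not confirmed values for this model.

```python
import json

# Hypothetical local endpoint (port depends on how the server is launched).
LOCAL_ENDPOINT = "http://localhost:8080/v1/chat/completions"

payload = {
    "model": "qwen3.6-27b",  # whatever id the local server registers (assumption)
    "messages": [
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "Refactor this loop into a list comprehension."},
    ],
    "temperature": 0.2,  # conservative sampling for code edits
    "max_tokens": 512,
}

# The editor (or its plugin) would POST this JSON body to LOCAL_ENDPOINT.
body = json.dumps(payload)
print(json.loads(body)["model"])
```

In VS Code or Cursor this usually reduces to pointing the assistant's "OpenAI base URL" setting at the local server and entering the served model id.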
Teams needing multimodal or large context: Qwen 3.6-27B is insufficient — pair with cloud frontier models.