Ant Group Ling-2.6 Fully Open-Sourced: Flash Activates Only 7.4B, 1T Flagship Built for "Execution-First"

Bottom Line

In late April, Ant Group (Inclusion AI / Ant Ling) open-sourced two models: Ling-2.6-Flash and Ling-2.6-1T. Both use an MoE architecture, carry the MIT license, and ship in BF16/FP8/INT4 precision variants. Compared to models of similar parameter scale, Ling’s core differentiation lies in its extremely low activation parameter count and execution-oriented design: not a benchmark-padding machine, but purpose-built for Agent workloads.

Dimension            | Ling-2.6-Flash | Ling-2.6-1T
Total Parameters     | 104B           | ~1T
Active Parameters    | 7.4B           | ~63B
Context Window       | 256K           | 256K+
License              | MIT            | MIT
SWE-Bench Verified   | 62             | 67+
BFCL-V4              | 67             | 72+
TAU2-Bench (Telecom) | 93.86          | 95+

What Happened

Ling-2.6-Flash: Ultra-Lightweight Agent Model

  • April 29: Ling-2.6-Flash weights officially open-sourced. 104B total parameters, only 7.4B activated per inference — meaning it can run on consumer-grade GPUs (single RTX 4090 with INT4 quantization).
  • Built on the Ling 2.0 architecture with a hybrid linear attention mechanism that replaces the previous GQA attention and significantly reduces inference latency (a conceptual sketch of the idea follows this list).
  • SWE-Bench Verified 62, BFCL-V4 67, TAU2-Telecom 93.86 — all hard-scenario metrics, not academic leaderboard-padding datasets.
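
The release text above doesn't spell out the mechanism. As a rough conceptual sketch, hybrid linear attention generally means interleaving kernelized linear-attention layers, which avoid the quadratic score matrix, with a smaller number of exact softmax-attention layers. Everything below (feature map, layer ratio, module names) is a hypothetical illustration, not Ling-2.6's actual architecture.

# Conceptual sketch of a hybrid attention stack: most layers use a linear
# (kernelized, O(n)) attention, a few keep standard softmax attention.
# Ratios and names are hypothetical, not Ling-2.6's real design.
import torch
import torch.nn.functional as F

def linear_attention(q, k, v):
    # Kernelized attention: phi(q) @ (phi(k)^T v), no n x n score matrix.
    q, k = F.elu(q) + 1, F.elu(k) + 1                       # positive feature map
    kv = torch.einsum("bnd,bne->bde", k, v)                 # running key-value summary
    z = 1 / (torch.einsum("bnd,bd->bn", q, k.sum(dim=1)) + 1e-6)
    return torch.einsum("bnd,bde,bn->bne", q, kv, z)

def softmax_attention(q, k, v):
    # Exact scaled dot-product attention, quadratic in sequence length.
    return F.scaled_dot_product_attention(q, k, v)

class HybridBlock(torch.nn.Module):
    def __init__(self, dim, use_softmax):
        super().__init__()
        self.qkv = torch.nn.Linear(dim, 3 * dim)
        self.attn = softmax_attention if use_softmax else linear_attention

    def forward(self, x):
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        return x + self.attn(q, k, v)

# Hypothetical ratio: one softmax layer for every four linear layers.
layers = [HybridBlock(64, use_softmax=(i % 4 == 3)) for i in range(8)]
x = torch.randn(2, 128, 64)
for blk in layers:
    x = blk(x)
print(x.shape)  # torch.Size([2, 128, 64])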

Ling-2.6-1T: Flagship Execution Model

  • Ling-2.6-1T was released the same day as Flash, with ~1T total parameters and ~63B active parameters.
  • Core design philosophy is “Execution-First”: reducing token waste during reasoning, skipping verbose internal monologue-style thinking, outputting executable results directly.
  • Community feedback: many frontier models’ reasoning outputs are essentially wasted tokens — users pay for every internal thought, but task completion rates don’t improve proportionally. Ling-2.6-1T directly addresses this problem.

Why It Matters

1. A New Variable in the Chinese MoE Camp

Previously, the main Chinese open-source MoE models were DeepSeek V4 (1.6T/37B active) and Kimi K2.6 (~1T). Ling-2.6’s entry means:

  • Flash tier (7.4B active): fills the gap for consumer GPU-runnable Agent models in Chinese open source
  • 1T tier (63B active): comparable active parameter count to DeepSeek V4, but with a more radical design philosophy — fewer tokens consumed, same task completion rate

2. Cost Revolution for Agent Scenarios

What do Ling-2.6-Flash’s 7.4B active parameters mean in practice?

  • With a model like GPT-5.5, a single API call’s reasoning output may consume hundreds of extra tokens
  • Ling-2.6-Flash cuts per-call cost to roughly 1/10 or lower through streamlined reasoning paths (see the back-of-envelope sketch after this list)
  • For Agent workloads requiring high-frequency calls, this is the key threshold from “experimental” to “production-grade”
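
As a back-of-envelope illustration of that 1/10 figure, the snippet below compares per-call cost for a verbose-reasoning model against a lean one. All token counts and prices are hypothetical placeholders, not published pricing or measured benchmarks.

# Hypothetical per-call cost comparison for a high-frequency agent loop.
# Every number below is an illustrative placeholder, not real pricing.
def call_cost(in_tokens, out_tokens, price_in_per_1k, price_out_per_1k):
    return in_tokens / 1000 * price_in_per_1k + out_tokens / 1000 * price_out_per_1k

# Verbose model: long internal monologue billed as output tokens.
verbose = call_cost(in_tokens=800, out_tokens=1200 + 200,
                    price_in_per_1k=0.005, price_out_per_1k=0.010)
# Lean model: short reasoning path and a lower per-token price.
lean = call_cost(in_tokens=800, out_tokens=100 + 200,
                 price_in_per_1k=0.001, price_out_per_1k=0.002)
print(f"verbose=${verbose:.4f}  lean=${lean:.4f}  ratio={verbose / lean:.1f}x")
# verbose=$0.0180  lean=$0.0014  ratio=12.9x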

3. Ant’s Open-Source Strategy Shift

Ant Ling previously focused on API services (Ling Chat). This full open-source release means:

  • Shift from closed services to ecosystem building
  • MIT license (rather than Apache 2.0 or a commercial license), allowing unrestricted commercial use
  • Available on both Hugging Face and ModelScope, covering international and domestic developers

Actionable Advice

Who Should Pay Attention

  • Agent developers: Ling-2.6-Flash’s 7.4B active parameters make it ideal for low-latency Agent calls
  • Cost-sensitive teams: in high-volume API call scenarios, Flash’s cost advantage is significant
  • Consumer GPU users: the INT4 quantized version runs the 104B MoE on a single RTX 4090

How to Get Started

# Install dependencies (shell)
pip install transformers accelerate

# Load Ling-2.6-Flash from Hugging Face
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "InclusionAI/Ling-2.6-Flash"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",      # spread weights across available devices
    torch_dtype="auto"      # use the precision stored in the checkpoint
)
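
For a quick smoke test, a minimal generation call looks like the sketch below. It assumes the checkpoint ships a chat template usable via transformers' apply_chat_template; check the model card for the recommended prompt format.

# Minimal inference sketch; assumes the repo provides a chat template.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
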
  • Hugging Face: huggingface.co/InclusionAI
  • ModelScope: modelscope.cn/organization/AntLingAGI
  • Official deployment docs: github.com/AntLingAGI/Ling

Caveats

  • As a newly open-sourced model, community tooling (Ollama, vLLM adapters) may still be catching up
  • SWE-Bench Verified 62 (Flash) vs DeepSeek V4’s 68+: pure coding ability still has a gap
  • The 1T version has high hardware requirements; try Flash first to evaluate whether the direction fits your workload