Key Takeaway
Meta released Llama 4 Scout in early April 2025: a 17B active / 109B total parameter MoE (Mixture of Experts) model with 16 experts.
- 10M token context window: process a 300-page document, or an entire codebase, in a single pass with no chunking
- $0.08/M input tokens: use an OpenAI-compatible API via aggregators (see the sketch after this list)
- Open weights: possibly the last fully open Meta model tier before Muse Spark goes closed-source
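
For reference, here is what an aggregator call looks like with the standard OpenAI Python SDK. The base URL and model ID below are illustrative (OpenRouter-style) assumptions, not official values; check your provider's documentation.

```python
# Minimal sketch of an OpenAI-compatible call through an aggregator.
# base_url and model ID are illustrative assumptions; substitute your
# provider's actual endpoint and model identifier.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # assumed aggregator endpoint
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",  # assumed model ID; varies by provider
    messages=[
        {"role": "user", "content": "Summarize the attached contract in five bullets."},
    ],
)
print(response.choices[0].message.content)
```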
Architecture
MoE Design
| Parameter | Value | Meaning |
|---|---|---|
| Total params | 109B | Total knowledge capacity |
| Active params | 17B | Activated per token during inference |
| Experts | 16 | Routable sub-networks |
The MoE advantage: 109B parameters of knowledge at the inference cost of a 17B model.
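
To make the routing concrete, here is a toy top-1 MoE layer in PyTorch. It is a minimal sketch for intuition only: Meta's actual Llama 4 router details (shared experts, load balancing, exact top-k) are not reproduced, and all dimensions are illustrative.

```python
# Toy top-1 MoE layer. A minimal sketch for intuition only; Meta's actual
# Llama 4 router (shared experts, load balancing, exact top-k) differs.
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    def __init__(self, d_model: int = 64, n_experts: int = 16):
        super().__init__()
        # Each expert is a small independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (n_tokens, d_model). Route each token to its single best
        # expert, so only ~1/16 of the expert weights run per token;
        # that is the "109B knowledge at 17B inference cost" effect.
        gate = self.router(x).softmax(dim=-1)   # (n_tokens, n_experts)
        weight, expert_idx = gate.max(dim=-1)   # winning expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = expert_idx == i
            if mask.any():
                out[mask] = weight[mask, None] * expert(x[mask])
        return out

layer = TinyMoELayer()
print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
```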
Price Comparison
| Model | Input Price ($/M tokens) | Context | Open |
|---|---|---|---|
| Llama 4 Scout | $0.08 | 10M | Yes |
| GPT-5.5 | $2.50 | 1M | No |
| Claude Opus 4.7 | $15.00 | 200K | No |
| DeepSeek-V4-Flash | $0.14 | 1M | Yes |
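
To see what these prices mean at volume, a quick back-of-the-envelope calculation using the table above. The monthly token volume is an assumption for illustration:

```python
# Back-of-the-envelope input costs at an assumed volume of 500M input
# tokens per month. Prices are $ per 1M input tokens, from the table above.
PRICES = {
    "Llama 4 Scout": 0.08,
    "GPT-5.5": 2.50,
    "Claude Opus 4.7": 15.00,
    "DeepSeek-V4-Flash": 0.14,
}

MONTHLY_TOKENS = 500_000_000  # assumption for illustration

for model, price in PRICES.items():
    cost = MONTHLY_TOKENS / 1_000_000 * price
    print(f"{model}: ${cost:,.2f}/month")
# Llama 4 Scout: $40.00/month
# GPT-5.5: $1,250.00/month
# Claude Opus 4.7: $7,500.00/month
# DeepSeek-V4-Flash: $70.00/month
```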
Best Use Cases
- Long document QA: legal, financial, and medical documents without chunking
- Codebase understanding: feed an entire codebase in a single prompt
- Batch document processing: summaries, classification, extraction (a minimal loop is sketched after this list)
- Cost-sensitive inference: high-volume workloads where a modest accuracy trade-off is acceptable
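
A minimal batch-summarization loop, reusing the `client` from the earlier sketch. The paths, prompt, and model ID are placeholders; production use would add token counting against the 10M window, retries, and rate-limit handling.

```python
# Placeholder batch loop: summarize every .txt file in docs/ in one call
# each (no chunking). Reuses `client` from the earlier sketch; paths and
# prompt are illustrative assumptions.
from pathlib import Path

def summarize(client, text: str) -> str:
    """Ask Scout for a short summary of one whole document."""
    response = client.chat.completions.create(
        model="meta-llama/llama-4-scout",  # assumed model ID, as above
        messages=[{"role": "user",
                   "content": f"Summarize in 3 bullet points:\n\n{text}"}],
    )
    return response.choices[0].message.content

Path("summaries").mkdir(exist_ok=True)       # placeholder output directory
for path in Path("docs").glob("*.txt"):      # placeholder input directory
    Path("summaries", path.name).write_text(summarize(client, path.read_text()))
```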
Summary
Llama 4 Scout isn’t the strongest model available, but it may be the most practical: it solves a real pain point (document chunking) at extremely low cost.
Meta’s open-source strategy is shifting from “open everything” to “tiered openness.” Llama 4 Scout may be the last train; if you want a fully open model at this scale, now is the time to board.