Bottom Line
Meta has released Llama 4 Scout, a 16-expert MoE model with 17B active / 109B total parameters, a 10M-token context window, and input pricing of $0.08 per million tokens. It is Meta’s last open-weight model before the closed-source Muse Spark, so if you pass on Scout, the next open-weight Meta model may be a long wait.
What Happened
Llama 4 Scout Core Specs
| Dimension | Specs |
|---|---|
| Architecture | 16-expert MoE |
| Total Parameters | 109B |
| Active Parameters | 17B |
| Context Window | 10M Tokens |
| Input Price | $0.08/M Tokens |
| Open Weights | ✅ (last open generation) |
| API Compatible | OpenAI-compatible format |
Key Features
10M Token Context:
- Fits a 300-page document in a single prompt, with no chunking
- About 78x the capacity of GPT-5.5’s 128K context (10M / 128K ≈ 78)
- A potential game-changer for RAG, legal document analysis, and codebase understanding
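The context figures above can be sanity-checked with quick arithmetic. The sketch below assumes roughly 500 tokens per page, which is a rule of thumb and not a number from Meta or from this article; real counts depend on the tokenizer and page layout. The function names are illustrative.

```shell
#!/usr/bin/env bash
# Rough context-window arithmetic.
# TOKENS_PER_PAGE is an assumption (rule of thumb), not a published figure.
TOKENS_PER_PAGE=500

# context_ratio BIG SMALL -> how many times larger BIG is than SMALL
context_ratio() {
  LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# pages_that_fit CONTEXT_TOKENS -> approximate page capacity of a window
pages_that_fit() {
  LC_ALL=C awk -v c="$1" -v t="$TOKENS_PER_PAGE" 'BEGIN { printf "%d\n", c / t }'
}

context_ratio 10000000 128000   # 10M window vs. a 128K window
pages_that_fit 10000000         # pages in a 10M window
pages_that_fit 128000           # pages in a 128K window
```

Under that assumption the script prints 78.1, 20000, and 256: a 300-page document (about 150K tokens) already overflows a 128K window, while a 10M window has room for thousands of pages.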
Extremely Low Input Price:
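- $0.08/M input tokens, tens to hundreds of times cheaper than most competitors in the comparison table below (DeepSeek V4 is the exception, at roughly 2-7x)
- 187-375x below GPT-5.5’s $15-30/M input price
- For large-context tasks (document analysis, code review), the cost advantage compounds with every token sent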
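To make the input price concrete, the cost of a single prompt is tokens divided by one million, times the per-million rate. The helper below is a minimal sketch with an illustrative function name; it covers input tokens only, since the article quotes no output price.

```shell
#!/usr/bin/env bash
# input_cost TOKENS PRICE_PER_MILLION -> input-side cost in USD
# (output tokens are billed separately and are not modeled here)
input_cost() {
  LC_ALL=C awk -v t="$1" -v p="$2" 'BEGIN { printf "%.4f\n", t / 1000000 * p }'
}

input_cost 10000000 0.08   # a full 10M-token prompt at $0.08/M
input_cost 150000 0.08     # a ~150K-token document at the same rate
```

At $0.08/M, filling the entire 10M-token window costs $0.80 of input, and a 150K-token document costs about $0.012.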
Last Open Weights:
- Meta’s Muse Spark has shifted to closed-source
- Scout may be the last downloadable, fine-tunable, deployable Meta open-weight model for a while
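If you plan to deploy the weights yourself, a first-order question is how much memory they occupy. The sketch below assumes weights-only size equals total parameters times bytes per parameter; it ignores KV cache, activations, and runtime overhead, and it counts all 109B parameters because a typical MoE serving setup keeps every expert loaded even though only 17B are active per token. The function name is illustrative.

```shell
#!/usr/bin/env bash
# weight_gb PARAMS_BILLIONS BYTES_PER_PARAM -> weights-only size in GB (10^9 bytes)
# Excludes KV cache, activations, and framework overhead.
weight_gb() {
  LC_ALL=C awk -v p="$1" -v b="$2" 'BEGIN { printf "%.1f\n", p * b }'
}

weight_gb 109 2     # 16-bit weights (bf16/fp16)
weight_gb 109 1     # 8-bit quantization
weight_gb 109 0.5   # 4-bit quantization
```

This prints 218.0, 109.0, and 54.5, so the precision you choose largely determines how many accelerators the weights alone require. Long contexts add KV-cache memory on top of these figures.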
Why It Matters
1. Price War in Long Context
| Model | Context | Input Price ($/M) | Architecture |
|---|---|---|---|
| Llama 4 Scout | 10M | $0.08 | 16-expert MoE |
| GPT-5.5 | 128K | $15-30 | Dense |
| Claude Opus 4.7 | 200K | $15 | Dense |
| Gemini 3.1 Pro | 1M | $3.50 | MoE |
| DeepSeek V4 | 1M | $0.14-0.55 | MoE |
Scout’s input price is 187-375x lower than GPT-5.5’s ($0.08 vs. $15-30 per million tokens), and its context window is about 78x larger (10M vs. 128K tokens).
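The same ratio can be computed for every row of the table. The sketch below copies the input prices from the table as given (using the low and high ends where a range is listed) and divides each by Scout’s $0.08/M; the labels and function name are illustrative.

```shell
#!/usr/bin/env bash
# price_ratio OTHER_PRICE SCOUT_PRICE -> how many times higher OTHER is
price_ratio() {
  LC_ALL=C awk -v o="$1" -v s="$2" 'BEGIN { printf "%.2f\n", o / s }'
}

SCOUT=0.08
# Input prices ($/M tokens) taken from the comparison table above.
while read -r model price; do
  printf '%-18s %sx\n' "$model" "$(price_ratio "$price" "$SCOUT")"
done <<'EOF'
GPT-5.5-low 15
GPT-5.5-high 30
Claude-Opus-4.7 15
Gemini-3.1-Pro 3.50
DeepSeek-V4-low 0.14
DeepSeek-V4-high 0.55
EOF
```

The output reproduces the 187-375x range for GPT-5.5 and shows that the gap is much narrower against DeepSeek V4, at about 1.75x to 6.9x.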
Actionable Advice
Who Should Pay Attention
- Long document processing: Legal, finance, academic document analysis
- Codebase understanding: Feed entire projects without chunking
- Cost-conscious teams: Large-scale text processing on a limited budget
- Teams that depend on open models: Need open weights for fine-tuning or private deployment
How to Get Started
# Via aggregator API (OpenAI-compatible)
# Check your provider's model list for the exact model ID.
curl https://api.together.ai/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-4-Scout",
    "messages": [{"role": "user", "content": "Analyze this 200-page contract..."}]
  }'
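Real documents contain quotes and newlines that break hand-written JSON, so for long inputs it is safer to let a tool build the request body. The sketch below uses jq and assumes jq 1.6 or later is installed; the function name, the file name `contract.txt`, and the prompt wording are placeholders, and the endpoint and model ID are simply the ones from the curl example above.

```shell
#!/usr/bin/env bash
# build_payload MODEL FILE -> chat-completions JSON body with FILE as the user message
# jq escapes quotes, backslashes, and newlines in the document text.
build_payload() {
  jq -cn --arg model "$1" --rawfile doc "$2" \
    '{model: $model,
      messages: [{role: "user", content: ("Analyze this contract:\n\n" + $doc)}]}'
}

# Example call (hypothetical file name; needs a valid $API_KEY):
# build_payload "meta-llama/Llama-4-Scout" contract.txt |
#   curl https://api.together.ai/v1/chat/completions \
#     -H "Authorization: Bearer $API_KEY" \
#     -H "Content-Type: application/json" \
#     --data-binary @-
```

Piping the body through `--data-binary @-` sends it to curl on stdin unchanged, which also avoids command-line length limits for very large documents.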
- Hugging Face: huggingface.co/meta-llama
- Aggregators: Together AI, Groq, OpenRouter