Llama 4 Scout: Meta's Last Open-Weight MoE, 10M Token Context at Just $0.08/M Input

Bottom Line

Meta has released Llama 4 Scout, a 16-expert MoE model with 17B active / 109B total parameters, a 10M-token context window, and input pricing of $0.08 per million tokens. It is Meta’s last open-weight model before Muse Spark goes closed-source, so teams that depend on open weights may face a long wait for the next one.

What Happened

Llama 4 Scout Core Specs

| Dimension | Specs |
|---|---|
| Architecture | 16-expert MoE |
| Total Parameters | 109B |
| Active Parameters | 17B |
| Context Window | 10M tokens |
| Input Price | $0.08/M tokens |
| Open Weights | ✅ (last open generation) |
| API Compatible | OpenAI-compatible format |
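
For readers weighing private deployment, the parameter counts in the table translate into some back-of-envelope numbers. The sketch below is illustrative arithmetic only: the bytes-per-parameter figures are the usual storage widths for each format, and the estimate covers weights alone (no KV cache, activations, or runtime overhead), so real memory needs will be higher.

```python
# Back-of-envelope arithmetic from the spec table (weights only).
TOTAL_PARAMS = 109e9   # all experts; every one must be resident in memory
ACTIVE_PARAMS = 17e9   # parameters actually used per token


def active_fraction(active: float, total: float) -> float:
    """Share of the model's parameters that participate in each token."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Weight storage in GB (1 GB = 1e9 bytes); excludes KV cache and activations."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    print(f"active share per token: {share:.1%}")
    # Assumed storage widths: 2 bytes (bf16), 1 byte (int8), 0.5 byte (int4).
    for name, width in [("bf16", 2.0), ("int8", 1.0), ("int4", 0.5)]:
        print(f"{name}: ~{weight_memory_gb(TOTAL_PARAMS, width):.1f} GB of weights")
```

The point of the MoE design shows up in the two numbers: compute per token scales with the 17B active parameters, while memory scales with the full 109B.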

Key Features

10M Token Context:

  • Fit a 300-page document in a single prompt, with no chunking
  • About 78x the capacity of GPT-5.5’s 128K context (10M / 128K ≈ 78)
  • Game-changing for RAG, legal document analysis, and codebase understanding
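
A rough way to check whether a document fits is to estimate its token count before sending it. The sketch below assumes about 500 words per page and about 1.3 tokens per word; both are illustrative rules of thumb, not measured values for Scout's tokenizer, and the output reserve is likewise a hypothetical figure.

```python
CONTEXT_WINDOW = 10_000_000   # tokens, per the spec table
GPT_CONTEXT = 128_000         # tokens, the GPT-5.5 figure quoted above

# Assumptions for illustration only (not tokenizer measurements):
WORDS_PER_PAGE = 500
TOKENS_PER_WORD = 1.3


def estimate_tokens(pages: int) -> int:
    """Rough token estimate for a document of the given page count."""
    return round(pages * WORDS_PER_PAGE * TOKENS_PER_WORD)


def fits(pages: int, reserve_for_output: int = 8_000) -> bool:
    """True if the document plus an output reserve fits in the context window."""
    return estimate_tokens(pages) + reserve_for_output <= CONTEXT_WINDOW


if __name__ == "__main__":
    print(f"capacity ratio: {CONTEXT_WINDOW / GPT_CONTEXT:.1f}x")
    for pages in (300, 5_000, 20_000):
        print(pages, "pages ->", estimate_tokens(pages), "tokens, fits:", fits(pages))
```

Under these assumptions a 300-page document uses only a small slice of the window; for anything near the limit, count tokens with the model's actual tokenizer instead of estimating.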

Extremely Low Input Price:

  • $0.08/M input, an order of magnitude cheaper than most competitors
  • 187-375x cheaper than GPT-5.5 input ($15-30/M)
  • For large-context tasks (document analysis, code review), input tokens dominate the bill, so the cost advantage is significant

Last Open Weights:

  • Meta’s Muse Spark has shifted to closed source
  • Scout may be the last Meta model with weights you can download, fine-tune, and self-host for a while

Why It Matters

1. Price War in Long Context

| Model | Context | Input Price ($/M) | Architecture |
|---|---|---|---|
| Llama 4 Scout | 10M | $0.08 | 16-expert MoE |
| GPT-5.5 | 128K | $15-30 | Dense |
| Claude Opus 4.7 | 200K | $15 | Dense |
| Gemini 3.1 Pro | 1M | $3.50 | MoE |
| DeepSeek V4 | 1M | $0.14-0.55 | MoE |

Scout’s input price is 187-375x cheaper than GPT-5.5, with 78x the context window.
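
The ratios follow directly from the listed prices. The sketch below recomputes them and prices a sample prompt; the figures are copied from the comparison table, and the 100K-token job size is a hypothetical chosen so that it fits every model's context window.

```python
# USD per million input tokens as (low, high), copied from the comparison table.
PRICES = {
    "Llama 4 Scout": (0.08, 0.08),
    "GPT-5.5": (15.0, 30.0),
    "Claude Opus 4.7": (15.0, 15.0),
    "Gemini 3.1 Pro": (3.50, 3.50),
    "DeepSeek V4": (0.14, 0.55),
}


def input_cost(tokens: int, price_per_million: float) -> float:
    """Input-side cost in USD for a prompt of the given size."""
    return tokens / 1_000_000 * price_per_million


def ratio_vs_scout(model: str) -> tuple[float, float]:
    """How many times more expensive a model's input is than Scout's (low, high)."""
    scout = PRICES["Llama 4 Scout"][0]
    low, high = PRICES[model]
    return low / scout, high / scout


if __name__ == "__main__":
    job_tokens = 100_000  # hypothetical prompt size
    for model, (low, high) in PRICES.items():
        r_low, r_high = ratio_vs_scout(model)
        print(f"{model}: ${input_cost(job_tokens, low):.3f}-${input_cost(job_tokens, high):.3f} "
              f"per {job_tokens:,}-token prompt ({r_low:.1f}x-{r_high:.1f}x Scout)")
```

Note that this covers input tokens only; output pricing is not given in the table and would need to be added for a full cost comparison.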

Actionable Advice

Who Should Pay Attention

  • Long-document processing: legal, finance, and academic document analysis
  • Codebase understanding: feed entire projects without chunking
  • Cost-conscious teams: large-scale text processing on a limited budget
  • Teams that depend on open models: need open weights for fine-tuning or private deployment

How to Get Started

# Via aggregator API (OpenAI-compatible)
# The exact model ID may differ between providers; check your provider's model list.
curl https://api.together.ai/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-4-Scout",
    "messages": [{"role": "user", "content": "Analyze this 200-page contract..."}]
  }'

  • Hugging Face: huggingface.co/meta-llama
  • Aggregators: Together AI, Groq, OpenRouter
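
The same request can be issued from Python using only the standard library. This is a minimal sketch, not an official client: the endpoint and model ID are taken from the curl example (the exact model ID may vary by provider), `build_request` and `send` are hypothetical helper names, and the response parsing assumes the standard OpenAI-style chat completion shape.

```python
import json
import urllib.request

API_URL = "https://api.together.ai/v1/chat/completions"  # from the curl example
MODEL = "meta-llama/Llama-4-Scout"  # as in the curl example; may differ by provider


def build_request(prompt: str, api_key: str, max_tokens: int = 512) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request without sending it."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 120.0) -> str:
    """Send the request and return the first choice's text (assumes OpenAI-style JSON)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

To use it, read the key from the environment and call `send(build_request("Analyze this 200-page contract...", os.environ["API_KEY"]))` after importing `os`. Keeping request construction separate from sending makes the payload easy to inspect or log before a long-context call is billed.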