Core Conclusion
The Steel team updated their Agent Cookbook on May 4, implementing the same task with the same set of tools across eight major agent frameworks. This “controlled variable” approach is the fairest horizontal framework comparison available, giving developers a direct reference for framework selection.
What Happened
Steel’s Cookbook covers eight frameworks:
| Framework | Language | Core Positioning | Characteristics |
|---|---|---|---|
| LangChain | Python/JS | General AI application framework | Largest ecosystem, most comprehensive docs, moderate learning curve |
| Mastra | TypeScript | Full-stack AI framework | Built-in workflows, RAG, agent orchestration, TypeScript native |
| Pydantic AI | Python | Type-safe AI applications | Uses Pydantic for structured output and validation |
| Vercel AI SDK | TypeScript | Frontend AI integration | Streaming responses, UI components, deep Next.js integration |
| Anthropic Agent SDK | Python/JS | Claude-native agent | Deeply optimized for Claude tool calls and long context |
| OpenAI Agent SDK | Python | OpenAI-native agent | Deeply optimized for GPT tool calls and function calling |
| LlamaIndex | Python | RAG-specific framework | Strongest data indexing and retrieval capabilities |
| CrewAI | Python | Multi-agent orchestration | Role division, task delegation, collaborative workflows |
The value of this Cookbook lies in eliminating variables — same task, same tool definitions, same model calls. The only difference is the framework API and architectural pattern. This allows direct comparison of code lines, implementation complexity, and readability.
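The controlled-variable setup can be pictured with a plain-Python sketch (all names here are hypothetical, not the Cookbook's actual code): one tool function and one framework-neutral schema that every framework adapter receives unchanged, so any difference in line count comes from the framework API alone.

```python
# Hypothetical shared tool: the same (function, schema) pair every
# framework implementation would wrap in its own registration API.
def search_web(query: str) -> str:
    """Stand-in tool; in the Cookbook this would call Steel's browser tools."""
    return f"results for: {query}"

# Framework-neutral tool schema (OpenAI-style function-calling JSON,
# which most of the eight frameworks accept or can adapt).
SEARCH_TOOL_SCHEMA = {
    "name": "search_web",
    "description": "Search the web and return a text summary.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
}
```

Holding this pair constant is what makes the per-framework line counts later in the article comparable.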
Framework Selection Guide
Scenario 1: Rapid Prototyping
If you need a working agent prototype within a day:
| Priority | Framework | Reason |
|---|---|---|
| 1 | Vercel AI SDK | Seamless Next.js integration, UI + Agent in one |
| 2 | LangChain | Rich documentation, many examples, easy to search for community answers |
| 3 | Mastra | TypeScript full-stack, built-in workflow engine |
Scenario 2: Production-Grade Agent System
If you need to deploy to production and maintain the system long-term:
| Priority | Framework | Reason |
|---|---|---|
| 1 | Anthropic Agent SDK | If using Claude, this is the optimal choice (lowest tool call latency) |
| 2 | OpenAI Agent SDK | If using GPT, this is the optimal choice (most stable function calling) |
| 3 | Pydantic AI | Type-safe, suitable for scenarios with strict output format requirements |
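Pydantic AI's selling point is that model output is parsed into a declared type and rejected when it doesn't conform. A stdlib-only sketch of that idea (a dataclass standing in for a Pydantic model; all names hypothetical, not the Pydantic AI API):

```python
import json
from dataclasses import dataclass

@dataclass
class Invoice:
    vendor: str
    total_cents: int

def parse_model_output(raw: str) -> Invoice:
    """Validate raw LLM output against the declared type, the way
    Pydantic AI does automatically with Pydantic models."""
    data = json.loads(raw)
    if not isinstance(data.get("vendor"), str):
        raise ValueError("vendor must be a string")
    if not isinstance(data.get("total_cents"), int):
        raise ValueError("total_cents must be an integer")
    return Invoice(vendor=data["vendor"], total_cents=data["total_cents"])
```

The point of the pattern: malformed model output fails loudly at the boundary instead of propagating into downstream code.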
Scenario 3: Multi-Agent Collaboration
If your scenario requires multiple agents to work together:
| Priority | Framework | Reason |
|---|---|---|
| 1 | CrewAI | Designed specifically for multi-agent collaboration, most complete role/task/process abstractions |
| 2 | Mastra | Built-in workflow orchestration, supports parallel and serial execution |
| 3 | LangGraph (LangChain) | State graph approach to multi-agent orchestration, flexible but with a steep learning curve |
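The role/task/process abstraction that CrewAI leads with can be approximated in a few lines of plain Python. This is a hypothetical sketch of the sequential pattern, not the real CrewAI API:

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Agent:
    role: str
    run: Callable[[str], str]  # each agent maps an input string to an output

@dataclass
class Crew:
    agents: List[Agent] = field(default_factory=list)

    def kickoff(self, task: str) -> str:
        """Serial pipeline: each agent's output becomes the next agent's
        input, mimicking a sequential multi-agent process."""
        result = task
        for agent in self.agents:
            result = agent.run(result)
        return result

researcher = Agent(role="researcher", run=lambda t: f"notes({t})")
writer = Agent(role="writer", run=lambda t: f"draft({t})")
crew = Crew(agents=[researcher, writer])
```

What a real framework adds on top of this skeleton is delegation, retries, shared memory, and parallel process modes, which is where the abstractions start paying for their complexity.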
Code Complexity Comparison
Based on the Steel Cookbook implementations, estimated line counts for the same task across the eight frameworks:
| Framework | Code Lines | Config Complexity | Onboarding Difficulty |
|---|---|---|---|
| Vercel AI SDK | ~50 lines | Low | ⭐ |
| LangChain | ~80 lines | Medium | ⭐⭐ |
| Mastra | ~60 lines | Low | ⭐⭐ |
| Pydantic AI | ~70 lines | Medium | ⭐⭐ |
| Anthropic Agent SDK | ~45 lines | Low | ⭐ |
| OpenAI Agent SDK | ~45 lines | Low | ⭐ |
| LlamaIndex | ~100 lines | High | ⭐⭐⭐ |
| CrewAI | ~90 lines | Medium | ⭐⭐ |
Key Finding: The model-native SDKs (Anthropic/OpenAI) require the least code because they skip the cross-model abstraction layer. But if your system needs to switch models, the cross-model abstraction in LangChain or Mastra is worth the extra lines.
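The trade-off behind that finding can be made concrete. A cross-model framework inserts an interface like the one below between your agent code and the provider SDK; a native SDK skips it. A minimal sketch in the spirit of such abstraction layers (all names hypothetical, with fake providers in place of real SDK clients):

```python
from typing import Protocol

class ChatModel(Protocol):
    """Minimal cross-model interface: anything with a complete() method."""
    def complete(self, prompt: str) -> str: ...

class FakeClaude:
    # Stand-in for an Anthropic-backed client.
    def complete(self, prompt: str) -> str:
        return f"claude:{prompt}"

class FakeGPT:
    # Stand-in for an OpenAI-backed client.
    def complete(self, prompt: str) -> str:
        return f"gpt:{prompt}"

def run_agent(model: ChatModel, task: str) -> str:
    # Agent logic is written once against the interface; swapping models
    # (e.g. for A/B testing) is a one-line change at the call site.
    return model.complete(task)
```

The abstraction costs code and hides provider-specific features, which is exactly why the native SDKs come in shorter in the table above.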
Landscape Assessment
Agent frameworks in 2026 are splitting into two directions:
- Model-native camp: Anthropic Agent SDK, OpenAI Agent SDK — deeply tied to a single model, pursuing ultimate performance and developer experience
- Cross-model camp: LangChain, Mastra, Vercel AI SDK — providing model abstraction layers, pursuing flexibility and portability
Which direction to choose depends on your business needs:
- If your product deeply relies on a specific model’s capabilities (like Claude’s long context), choose the native SDK
- If you need flexible model switching or multi-model A/B testing, choose cross-model frameworks
Action Recommendations
| Role | Recommendation |
|---|---|
| New Developers | Start with Steel Cookbook, look at 2-3 framework implementations, feel the different API styles before deciding |
| Technical Selection | Don't be swayed by “largest ecosystem.” LangChain's large ecosystem doesn't mean it fits your scenario; weigh code complexity and maintenance cost instead |
| Team Leaders | Unifying your team’s framework selection is more important than pursuing the “best framework.” Framework switching costs are much higher than expected |