GPT-5.5 Release: OpenAI Reclaims Terminal Performance Lead, Price War Intensifies

OpenAI released GPT-5.5 on April 23, its first major version upgrade since GPT-4.5. The new model scored 82.7% on Terminal-Bench 2.0, pulling ahead of Claude Opus 4.7 (69.4%) and Gemini 3.1 Pro (68.5%). This benchmark measures long tasks in command-line scenarios requiring planning, iteration, and tool coordination—precisely the area Anthropic highlighted in Opus 4.7’s launch. GPT-5.5 improved 7.6 percentage points over GPT-5.4 (75.1%).

The pricing signal is equally noteworthy. GPT-5.5’s API pricing is $5.00/M input and $30.00/M output, a significant jump from GPT-5.4 ($3.50/$18.00). Claude Opus 4.7 is priced at $5.00/$25.00, while DeepSeek V4 Pro is just $2.20/$3.48. GPT-5.5 is now the most expensive frontier model.

Key Data Comparison

Model	Input ($/M)	Output ($/M)	Terminal-Bench 2.0	Context Window
GPT-5.5	5.00	30.00	82.7% (SOTA)	200K
Claude Opus 4.7	5.00	25.00	69.4%	200K
Gemini 3.1 Pro	3.50	15.00	68.5%	1M
DeepSeek V4 Pro	2.20	3.48	Not disclosed	1M

From GPT-5.0 to GPT-5.5, OpenAI’s pricing followed a steep curve: input price rose from $0.625 to $5.00 (8x), output from $5.00 to $30.00 (6x). Community feedback suggests GPT-5.5 also consumes more tokens per task, making actual costs even higher than advertised.

Landscape Assessment

GPT-5.5’s strategy is clear: use capability advantages to offset price disadvantages. The lead on Terminal-Bench 2.0 (13 percentage points over second place) is significant, but it’s concentrated in code and terminal tasks. Community feedback on general conversation and Chinese writing is more mixed.

Meanwhile, DeepSeek V4 Pro delivers near-comparable performance at less than one-ninth the price of GPT-5.5, moving from “budget alternative” to “primary choice.” As one developer put it: “DeepSeek’s pricing is doing to enterprise software margins what Costco did.” If this trend continues, price-sensitive users migrating to open-source models could accelerate.

Action Items

Heavy terminal/code users: GPT-5.5’s Terminal-Bench advantage is real. Worth trying if your workflow relies heavily on CLI tools. Watch for unexpected token consumption.
General conversation and long text: Gemini 3.1 Pro’s 1M context and lower price remain the better value proposition.
Cost-sensitive scenarios: DeepSeek V4 Pro’s API pricing is low enough to replace GPT-5.5 in most non-frontier tasks.
Watch for Codex quota changes: Community speculation suggests OpenAI may reduce Codex subscription GPT call quotas in June. Plan usage accordingly.

Key Data Comparison

Landscape Assessment

Action Items

Sources

Related

DeepSeek V4 Image Mode Rolls Out in Beta, Closing the Last Major Gap

OpenAI Workspace Agents Launch: From Personal Chat to Team Automation, ChatGPT Paradigm Shift

DeepSeek V4 Flash Review: Tool Calling Significantly Improved, Multi-Step Workflows in One Prompt