GPT-5.5 Codex Agent Tested: Browser Control, Computer Operations, and Autonomous Execution

When OpenAI released GPT-5.5 on April 23, it simultaneously launched Codex Agent mode. Beyond programming, the most eye-catching feature of GPT-5.5’s Agent mode is browser control and computer operation: the AI can autonomously navigate web pages, operate application interfaces, and even negotiate with customer service.

Agent Capability Overview

Core capabilities of GPT-5.5 Codex Agent:

Browser Control: With a ChatGPT Pro+ subscription and computer use enabled, the Agent can take over the browser, autonomously completing login, navigation, form filling, and other operations
Computer Operations: Directly controls operating system interfaces without going through APIs or command lines
Real-time Decision Making: When encountering pop-ups, captchas, or page changes during operations, the Agent can autonomously judge and adjust strategies

Real-World Test Cases

Several real-world use cases have been documented by the community:

Cancel subscription and request refund: A user asked the Agent to “log into Amazon, cancel Prime membership, and request a refund for April’s $15.89 charge.” The Agent autonomously completed:

Logged into the Amazon account
Navigated to the membership management page
Cancelled the subscription
Opened the online customer service chat
Explained the billing cycle and negotiated a refund
Successfully received a $15 refund

The entire process was completed in minutes without human intervention.

Branded meeting background generation: At DevDay, OpenAI showcased the BrandRoom project, which uses Codex with GPT-5.5 and GPT Image 2 to automatically generate branded backgrounds for remote teams’ video conferences.

Comparison with Claude Code

In agent programming scenarios, some users report that Codex’s pricing is less transparent than Claude Code’s. One 16-person engineering team, for example, considered switching from Codex to Cursor because Cursor’s token usage and pricing are more transparent and it supports more models, such as Composer 2.

However, Codex Agent currently leads in browser control: Claude Code focuses primarily on operations within the coding environment, while Codex can operate browsers and a wider range of desktop applications.

A Side Note: The Goblin System Prompt

A new rule has been added to Codex’s system prompt: it prohibits mentioning goblins, gremlins, trolls, and other “creatures” unless they are relevant to the task. The community discovered that GPT-5.5 had previously overreacted to the word “goblin” in Codex, prompting OpenAI to add an explicit prohibition. The episode reflects how unpredictable Agent models can be in complex interactions.

Action Recommendations

Users needing browser automation: GPT-5.5 Codex Agent’s browser control is one of the most mature solutions currently available. Subscribe to Pro+ and enable computer use to test it.
Engineering teams: If your team primarily uses programming agents rather than browser automation, compare the token costs and pricing transparency of Codex against Cursor and Claude Code.
Security considerations: Agents can operate browsers and system interfaces, so set clear permission boundaries and a defined operational scope in production environments.
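
The security recommendation above can be made concrete with a deny-by-default policy that is checked before the agent performs each step. The following is only an illustrative sketch: AgentPolicy, its fields, and the action names are hypothetical and are not part of any Codex or OpenAI API. It assumes your own wrapper code can see each proposed action and its target URL before the action runs.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical permission boundary for a browser-operating agent."""
    allowed_domains: frozenset = field(default_factory=frozenset)
    allowed_actions: frozenset = field(default_factory=frozenset)
    # Actions that are in scope but require a human to approve them first.
    confirm_actions: frozenset = field(default_factory=frozenset)

    def check(self, action: str, url: str) -> str:
        """Return 'allow', 'confirm', or 'deny' for a proposed action."""
        host = (urlparse(url).hostname or "").lower()
        # A host is in scope if it equals an allowed domain or is a subdomain of one.
        in_scope = any(
            host == domain or host.endswith("." + domain)
            for domain in self.allowed_domains
        )
        if not in_scope:
            return "deny"
        if action in self.confirm_actions:
            return "confirm"
        if action in self.allowed_actions:
            return "allow"
        # Anything not explicitly listed is denied by default.
        return "deny"


# Example policy: placeholder domain and action names, chosen for illustration.
policy = AgentPolicy(
    allowed_domains=frozenset({"example.com"}),
    allowed_actions=frozenset({"navigate", "read", "fill_form"}),
    confirm_actions=frozenset({"cancel_subscription", "submit_payment"}),
)

print(policy.check("navigate", "https://www.example.com/account"))
print(policy.check("cancel_subscription", "https://example.com/prime"))
print(policy.check("navigate", "https://evil-example.com/"))
```

Routing high-impact actions such as cancellations or payments through a "confirm" result keeps a human in the loop for exactly the kind of task described in the refund case, while routine navigation stays automatic.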

Primary Sources

OpenAI Codex
OpenAI DevDay 2026
Community test reports (X/Twitter)

