GPT-5.5 Codex Agent Tested: Browser Control, Computer Operations, and Autonomous Execution

When OpenAI released GPT-5.5 on April 23, it simultaneously launched Codex Agent mode. Beyond programming, the most eye-catching feature of GPT-5.5’s Agent mode is browser control and computer operations: the AI can autonomously navigate web pages, operate application interfaces, and even negotiate with customer service.

Agent Capability Overview

Core capabilities of GPT-5.5 Codex Agent:

  • Browser Control: With a ChatGPT Pro+ subscription and computer use enabled, the Agent can take over the browser, autonomously completing login, navigation, form filling, and other operations
  • Computer Operations: Directly controls operating system interfaces without going through APIs or command lines
  • Real-time Decision Making: When it encounters pop-ups, captchas, or page changes mid-task, the Agent can assess the situation and adjust its strategy on its own

Real-World Test Cases

Several real-world use cases have been documented by the community:

Cancel subscription and request refund: A user asked the Agent to “log into Amazon, cancel Prime membership, and request a refund for April’s $15.89 charge.” The Agent autonomously completed:

  1. Logged into the Amazon account
  2. Navigated to the membership management page
  3. Cancelled the subscription
  4. Opened the online customer service chat
  5. Explained the billing cycle and negotiated a refund
  6. Successfully received a $15 refund

The entire process was completed in minutes without human intervention.

Brand meeting room background generation: At DevDay, OpenAI showcased the BrandRoom project, which uses Codex with GPT-5.5 and GPT Image 2 to automatically generate branded meeting backgrounds for remote teams’ video conferences.

Comparison with Claude Code

In Agent programming scenarios, some users report that Codex’s pricing is less transparent than Claude Code’s. One 16-person engineering team considered switching from Codex to Cursor because Cursor’s token usage and pricing are more transparent and it supports more models, such as Composer 2.

However, Codex Agent’s browser control capability currently leads: Claude Code focuses primarily on operations within the coding environment, while Codex can operate browsers and a wider range of desktop applications.

A Side Note: The Goblin System Prompt

A new rule has been added to Codex’s system prompt: it prohibits mentioning goblins, gremlins, trolls, and other “creatures” unless they are relevant to the task. The community discovered that GPT-5.5 had previously overreacted to the word “goblin” in Codex, which prompted OpenAI to add the explicit prohibition. This reflects the unpredictability of Agent models in complex interactions.

Action Recommendations

  • Users needing browser automation: GPT-5.5 Codex Agent’s browser control is one of the most mature solutions currently available. Subscribe to Pro+ and enable computer use to test it
  • Engineering teams: If your team primarily uses programming agents rather than browser automation, compare Codex vs. Cursor/Claude Code token costs and transparency
  • Security considerations: Agents can operate browsers and system interfaces. Set clear permission boundaries and operational scope in production environments
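
As an illustration of what a permission boundary might look like, here is a minimal Python sketch of an allowlist check that a wrapper could run before each proposed agent step. This is not part of Codex or any OpenAI API: the AgentPolicy class, the check_action function, the action names, and the example.com host are all hypothetical and stand in for whatever hooks your own tooling exposes.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical allowlist policy for a browser-controlling agent."""
    allowed_hosts: frozenset       # domains the agent may touch
    allowed_actions: frozenset     # action names the agent may perform
    confirm_actions: frozenset = frozenset()  # actions needing human approval


def check_action(policy: AgentPolicy, action: str, url: str) -> str:
    """Return "allow", "confirm", or "deny" for one proposed agent step."""
    host = (urlparse(url).hostname or "").lower()
    # Accept the exact host or a true subdomain; "evil-example.com" must not
    # match "example.com", so compare against "." + allowed host.
    host_ok = any(host == h or host.endswith("." + h) for h in policy.allowed_hosts)
    if not host_ok or action not in policy.allowed_actions:
        return "deny"
    if action in policy.confirm_actions:
        return "confirm"
    return "allow"


# Example policy (all values are placeholders).
policy = AgentPolicy(
    allowed_hosts=frozenset({"example.com"}),
    allowed_actions=frozenset({"navigate", "click", "fill_form", "submit_payment"}),
    confirm_actions=frozenset({"submit_payment"}),
)
```

With this policy, navigating within example.com is allowed, submitting a payment is routed to a human for confirmation, and any step on an unlisted host or with an unlisted action is denied by default. Denying by default is the design choice that matters here: the agent's scope is whatever was explicitly granted, not whatever was not explicitly forbidden.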

Primary Sources