Google I/O 2026: Gemini 3.5 Flash Launches with Agent Focus and 4x Inference Speed

At the Google I/O 2026 keynote, DeepMind CTO Koray Kavukcuoglu and Jeff Dean took the stage together to announce something not entirely unexpected but still worth paying attention to: the Gemini 3.5 family is here. First up is 3.5 Flash, with 3.5 Pro already running internally and due next month.

3.5 Flash's positioning is clear—it's not here to chase leaderboard numbers. It's built for agents.

The numbers first

Terminal-Bench 2.1 hits 76.2%, GDPval-AA scores 1656 Elo, MCP Atlas reaches 83.6%. On CharXiv Reasoning for multimodal understanding, it scores 84.2%. What does this mean in plain terms? On coding and agent tasks, 3.5 Flash outperforms Google's own previous-generation 3.1 Pro.

Speed is the other selling point. Google claims output token speed is 4x faster than other frontier models. I haven't independently verified that number, but if it holds up, the latency tradeoff is very acceptable for agent workflows that require heavy back-and-forth interaction.

Agentic is the core

The biggest change in 3.5 Flash isn't "smarter"—it's "more resilient." Long-horizon multi-step task execution is the focus this time. Google demonstrated several cases:

Two subagents synthesizing the AlphaZero paper and coding a playable game from scratch in 6 hours
Migrating a messy legacy codebase to Next.js
A bank using it to automatically process 100+ page compliance documents

These are all tasks that require sustained planning, execution, checking, and iteration. Previous models tended to drift off course or give up midway on these kinds of jobs. 3.5 Flash is designed to reduce that failure rate.

Paired with Antigravity (Google's new agent-first development platform), 3.5 Flash can orchestrate multiple subagents working in parallel. The approach is similar in spirit to Anthropic's computer use and OpenAI's operator, but Google's execution is more engineering-focused—directly integrated into AI Studio and Android Studio.

What about pricing?

Google mentions that 3.5 Flash costs "less than half of other frontier models" but hasn't published a specific API pricing table yet. That detail needs to be confirmed in follow-up documentation. If the price really is cut in half, combined with 4x speed, the migration case for developers already running agent frameworks is solid.

What to watch

3.5 Flash is already live in the Gemini App, Google Search AI mode, Gemini API, AI Studio, and Android Studio. Enterprise users can access it through the Gemini Enterprise Agent Platform.

But I'd recommend waiting for 3.5 Pro before making a final call. The Flash line has always been the value route—Pro is where the real frontier capabilities live. If Pro can push another tier above Flash's speed, then Google's I/O wave genuinely delivered a competitive hand.

Main sources:

The numbers first

Agentic is the core

What about pricing?

What to watch

Related

Chrome DevTools Officially Releases MCP Server: AI Coding Agents Can Finally "See" the Browser

Google I/O 2026: The "Agentification" of Search Isn't an Upgrade, It's a Rewrite

Google's SynthID Watermarking Technology Adopted by Giants Like OpenAI and Nvidia: AI Content Provenance Enters the Standardization Era