Google Releases Gemini 3.5 Flash, Beats Gemini 3.1 Pro On Many Benchmarks

AI is accelerating so fast that Flash models are beating Pro models from just months ago.

Google DeepMind announced Gemini 3.5 Flash at Google I/O 2026 today, kicking off a new 3.5 model family with its strongest agentic model yet. The release is a pointed statement: a Flash-tier model, built for speed and cost efficiency, now outperforms Gemini 3.1 Pro, the Google flagship that launched only in February, on coding and agentic benchmarks.

What Gemini 3.5 Flash Does

Gemini 3.5 Flash is purpose-built for agents and long-horizon tasks. It can plan and reason across massive codebases, deploy subagents to work in parallel, and sustain complex workflows over extended periods. Google says it outperforms 3.1 Pro on three key benchmarks: Terminal-Bench 2.1 (coding), GDPval-AA Elo (real-world agentic tasks), and MCP Atlas (scaled tool use).

The benchmark figures in the chart Google DeepMind released tell the story clearly:

Coding (Terminal-Bench 2.1): 3.5 Flash scores 76.2% vs. 3.1 Pro’s 70.3%
Real-World Agentic (GDPval-AA Elo): 3.5 Flash scores 1656 vs. 3.1 Pro’s 1314
Scaled Tool Use (MCP Atlas): 3.5 Flash scores 83.6% vs. 3.1 Pro’s 78.2%
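
To put those gaps in concrete terms, here is a minimal Python sketch that computes the margin on each benchmark from the figures above. The helper names are illustrative. The Elo conversion assumes GDPval-AA uses the conventional logistic Elo formula with a 400-point scale, which the article does not confirm, so treat the resulting expected score as a rough guide only.

```python
# Scores as reported above: (Gemini 3.5 Flash, Gemini 3.1 Pro)
scores = {
    "Terminal-Bench 2.1 (%)": (76.2, 70.3),
    "GDPval-AA (Elo)": (1656, 1314),
    "MCP Atlas (%)": (83.6, 78.2),
}


def gaps(table):
    """Return Flash-minus-Pro margins, rounded to one decimal place."""
    return {name: round(flash - pro, 1) for name, (flash, pro) in table.items()}


def elo_expected_score(rating_a, rating_b, scale=400):
    """Expected score of A against B under the standard Elo formula.

    ASSUMPTION: the benchmark's Elo follows the conventional 400-point
    logistic scale; the source does not say so.
    """
    return 1 / (1 + 10 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    for name, gap in gaps(scores).items():
        print(f"{name}: +{gap}")
    flash_elo, pro_elo = scores["GDPval-AA (Elo)"]
    print(f"Expected head-to-head score: {elo_expected_score(flash_elo, pro_elo):.2f}")
```

Under that assumption, the 342-point Elo gap works out to an expected head-to-head score of roughly 0.88 for 3.5 Flash, a far larger margin than the 5.9 and 5.4 percentage-point gains on the other two benchmarks.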

Speed is the other headline. Google CEO Sundar Pichai said at the keynote that Gemini 3.5 Flash delivers 289 tokens per second — four times faster than other frontier models. That’s not a marginal improvement; it changes the economics of running agents at scale.
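
A quick back-of-the-envelope calculation shows what that throughput means for a long agent run. This sketch reads "four times faster" as four times the output throughput, which is an assumption, and the 10,000-token workload is a hypothetical example rather than a figure from Google.

```python
FLASH_TOKENS_PER_SEC = 289  # figure quoted at the keynote
ASSUMED_SPEEDUP = 4         # ASSUMPTION: "four times faster" means 4x throughput

# Implied throughput of the comparison models under that assumption
baseline_tokens_per_sec = FLASH_TOKENS_PER_SEC / ASSUMED_SPEEDUP


def generation_seconds(num_tokens, tokens_per_sec):
    """Time to stream num_tokens of output at a steady rate."""
    return num_tokens / tokens_per_sec


if __name__ == "__main__":
    workload = 10_000  # hypothetical output size for one agent task
    fast = generation_seconds(workload, FLASH_TOKENS_PER_SEC)
    slow = generation_seconds(workload, baseline_tokens_per_sec)
    print(f"3.5 Flash: {fast:.1f} s")
    print(f"Implied baseline: {slow:.1f} s")
```

On those assumptions, a 10,000-token output takes about 35 seconds at 289 tokens per second versus a little over two minutes at the implied baseline of about 72 tokens per second. Across hundreds of chained agent steps, that difference compounds.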

Why This Matters

The GDPval-AA result deserves particular attention. When Gemini 3.1 Pro launched, it scored 1,317 on GDPval-AA, a meaningful step up from 3 Pro but still behind Claude Sonnet 4.6, which led the field on that benchmark. Gemini 3.5 Flash’s score of 1,656 leaps well past that mark, signaling a step-change in Google’s agentic capabilities rather than incremental progress.

The context matters here. Gemini 3.1 Flash-Lite, released in March, was already impressive — faster and cheaper than Gemini 2.5 Flash, with thinking levels that let developers trade latency for depth. 3.5 Flash goes further, not just on efficiency but on raw capability.

The Bigger Picture

Google has been running a sustained campaign to reclaim the top of the AI stack. The Gemini 3 Deep Think mode already put the model in elite competitive programming territory. Gemini 3.1 Pro followed with top-of-chart results on the Artificial Analysis Intelligence Index at half the price of rivals. Now 3.5 Flash extends that pattern into the agentic layer — where the real enterprise value is being built.

The cadence is notable. Benchmark results that required a Pro-class model in February are now being beaten by a Flash-class model roughly three months later. If that compression continues, the distinction between “flagship” and “efficient” model tiers will keep collapsing.

Gemini’s share of global generative AI web traffic has already crossed 26%. Gemini 3.5 Flash, which is fast, capable, and priced to scale, is built to push that further, particularly among developers building agent-driven applications.

Gemini 3.5 Flash is available now. The 3.5 family is expected to grow.