Gemini 3.5 Flash: the new default model is four times faster and beats 3.1 Pro on code

On the Mountain View stage, Sundar Pichai introduced the first model in the Gemini 3.5 family. The detail that matters: Flash beats Gemini 3.1 Pro on agentic and coding benchmarks at a fraction of the cost.

Gemini 3.5 Flash became generally available on May 19 and is the new default model in AI Mode on Search. It's the first of the 3.5 family, and it follows the classic Google pattern: Flash first, Pro later. Except this time Flash, based on the data Google shared, beats Gemini 3.1 Pro on a good chunk of the agentic and coding benchmarks.

The numbers

Flash scores 76.2% on Terminal-Bench 2.1, 83.6% on MCP Atlas, and 84.2% on CharXiv Reasoning. On GDPval-AA, the benchmark of real-world work tasks run by Artificial Analysis, Flash scores 1,656, while 3.1 Pro stopped at 1,317. Pichai cited a speed figure on stage: 289 tokens per second, which he described as four times faster than other frontier models. API pricing is $1.50 per million input tokens and $9 per million output tokens, and the model has a 1M-token context window.
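
To put the pricing and speed figures in practical terms, here is a rough back-of-the-envelope sketch in Python. It assumes flat list pricing (no caching discounts or long-context tiers, neither of which the announcement details) and treats the 289 tokens per second as a sustained output rate. The function names and the example task size are hypothetical, chosen only for illustration.

```python
# List prices quoted in the announcement, in USD per one million tokens.
INPUT_USD_PER_MTOK = 1.50
OUTPUT_USD_PER_MTOK = 9.00
# Throughput figure cited on stage; assumed here to be a sustained output rate.
TOKENS_PER_SECOND = 289


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Linear cost estimate from the quoted per-million-token prices."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


def estimate_generation_seconds(output_tokens: int) -> float:
    """Time to emit the output at the quoted throughput, ignoring latency."""
    return output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical agentic coding task: 200k tokens read, 20k tokens written.
    cost = estimate_cost_usd(200_000, 20_000)
    seconds = estimate_generation_seconds(20_000)
    print(f"cost: ${cost:.2f}, generation time: {seconds:.0f} s")
    # prints "cost: $0.48, generation time: 69 s"
```

Under these assumptions, output tokens dominate quickly: they cost six times as much as input tokens, so a long agentic run that writes a lot costs noticeably more than one that mostly reads.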

What it's for

Google isn't presenting it as a better chatbot. It's presenting it as a model built to act: planning across large codebases, deploying sub-agents that work in parallel, and sustaining complex long-horizon workflows. It's the model running under Antigravity, under Spark, and under the new agentic features in Search.

Why it matters

A Flash model that beats the previous generation's Pro at roughly a tenth of the cost signals that Google has stopped treating the Flash tier as a compromise. It's the model Google is betting on for the agentic phase of the product.
