Gemini 3.5 Flash is the new default model in AI Mode on Search and became generally available (GA) on May 19. It's the first of the 3.5 family, and it's the classic Google move: Flash first, Pro later. Except this time, going by the data Google shared, Flash beats Gemini 3.1 Pro on a good chunk of the agentic and coding benchmarks.
The numbers
Flash scores 76.2% on Terminal-Bench 2.1, 83.6% on MCP Atlas, and 84.2% on CharXiv Reasoning. On GDPval-AA, the real-work evaluation that Artificial Analysis runs on OpenAI's GDPval tasks, Flash scores 1,656, while 3.1 Pro stopped at 1,317. Pichai cited a speed figure on stage: 289 tokens per second, four times faster than other frontier models. API pricing is $1.50 per million input tokens and $9 per million output tokens, with a 1M-token context window.
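To make the pricing and speed figures concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above. The function names and the example request size are made up for illustration, and it assumes the 289 tokens per second figure describes output generation speed, which the keynote number does not spell out.

```python
# Figures quoted in the text; treat them as inputs to the estimate, not guarantees.
INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 9.00
TOKENS_PER_SECOND = 289  # assumption: this is output (generation) throughput


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost of one request at the quoted list prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return input_cost + output_cost


def estimate_generation_seconds(output_tokens: int) -> float:
    """Estimate how long generating the output takes at the quoted speed."""
    if output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical agentic request: a large codebase slice in, a long patch out.
    prompt_tokens, completion_tokens = 200_000, 8_000
    cost = estimate_cost_usd(prompt_tokens, completion_tokens)
    seconds = estimate_generation_seconds(completion_tokens)
    print(f"cost: ${cost:.3f}, generation time: {seconds:.1f} s")
```

For that hypothetical request, the estimate comes out to about $0.37 and under half a minute of generation. Real bills and latencies will depend on things this sketch ignores, such as caching, time to first token, and any tiered pricing.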
What it's for
Google isn't presenting it as a better chatbot: it's presenting it as a model built to act. Planning across large codebases, deploying sub-agents that work in parallel, sustaining complex long-horizon workflows. It's the model running under Antigravity, under Spark, under the new agentic features in Search.
Why it matters
A Flash model that beats the previous generation's Pro, at roughly a tenth of the cost, is the signal that Google has stopped treating the Flash tier as a compromise. It's the model Google is betting on for the agentic phase of its products.