How AI infrastructure is evolving

Models, chips, and business models: Google chooses where to make margin

Eight articles trace the industrial backbone of the event. The Gemini 3.5 family arrives with three converging announcements: Gemini 3.5 Flash, the new default, which beats Gemini 3.1 Pro on coding and agentic benchmarks at a fraction of the cost; Gemini Omni, the multimodal model that produces video from text, images, and audio; and Gemma 4, the open-weight front, in four sizes. The Neural Expressive redesign of the Gemini app closes the consumer thread. On the chip front, Google explicitly splits training and inference for the first time: TPU 8t for frontier models, TPU 8i for agents. On the business-model front, three announcements say the same thing: the value isn't in the models, it's in the infrastructure that runs them. AI Ultra drops to $200 a month, the Gemini API moves from prompts per day to compute consumed, and AI Pro includes YouTube Premium Lite. Google is building its margin on inference.

Articles

In this area (8)

Google AI Pro now bundles YouTube Premium Lite: the quiet I/O 2026 deal

Buried in the Gemini plan reshuffle is a detail worth nine dollars a month: anyone paying for AI Pro at $19.99 now gets YouTube Premium Lite included. A retention move dressed up as a gift.

2026-05-21T02:10:00+02:00

Neural Expressive: the Gemini app redesign ditches the wall of text

Google rewrites Gemini's visual language with fluid animation, fresh typography, and responses that structure information instead of dumping it. Rolling out now on Android, iOS, and the web.

2026-05-20T09:30:00+02:00

Gemini changes its metric: from daily prompts to compute consumed (with automatic fallback to smaller models)

Google drops daily prompt caps and introduces a metric based on the compute actually consumed. The quota refreshes every five hours under a weekly ceiling, and if you hit the limit you are rerouted to smaller models instead of being blocked.

2026-05-20T09:20:00+02:00
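
To make the quota scheme in the item above concrete, here is a minimal sketch of a compute-based limiter with a five-hour window, a weekly ceiling, and fallback instead of a hard block. It is not Google's implementation: the class name, the budget figures, the abstract "compute units", and the tier labels are all hypothetical. It also assumes that fallback requests are not charged against the quota, and it leaves out the weekly reset for brevity.

```python
from dataclasses import dataclass

# Five-hour refresh interval, as described in the article.
WINDOW_SECONDS = 5 * 60 * 60


@dataclass
class ComputeQuota:
    """Hypothetical limiter; budgets are in made-up 'compute units'."""
    window_budget: float      # assumed allowance per five-hour window
    weekly_budget: float      # assumed weekly ceiling
    window_used: float = 0.0
    weekly_used: float = 0.0
    window_start: float = 0.0

    def route(self, cost: float, now: float) -> str:
        """Return the tier that serves a request costing `cost` units."""
        # Open a fresh window once five hours have elapsed.
        if now - self.window_start >= WINDOW_SECONDS:
            self.window_start = now
            self.window_used = 0.0

        over_window = self.window_used + cost > self.window_budget
        over_week = self.weekly_used + cost > self.weekly_budget
        if over_window or over_week:
            # Reroute rather than block (assumption: fallback is not charged).
            return "smaller-model"

        self.window_used += cost
        self.weekly_used += cost
        return "default-model"


quota = ComputeQuota(window_budget=10, weekly_budget=15)
first = quota.route(6, now=0)                 # fits in the window
second = quota.route(6, now=60)               # would exceed the window budget
third = quota.route(6, now=WINDOW_SECONDS)    # window has refreshed
```

The point of the design is that the window refresh restores access quickly, while the weekly counter keeps accumulating across windows and eventually becomes the binding limit.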

Gemma 4: Google's open model family gets refreshed, from E2B to 31B

At I/O 2026 Google announced Gemma 4, the new generation of its open-weight models built for on-device or self-hosted deployment. Four sizes from E2B to 31B, gains in code generation and instruction following, and a 27B variant optimized for 4-bit inference on consumer-grade hardware.

2026-05-20T08:10:00+02:00

TPU 8t and TPU 8i: the eighth generation splits in two and Google stops shipping one chip for everything

At the I/O 2026 developer keynote, Google again presented its new eighth-generation TPUs: two distinct chips, one for training frontier models (8t) and one for serving agents (8i). It's the first time Google has split the two workloads.

2026-05-20T06:05:00+02:00

Gemini 3.5 Flash: the new default model is four times faster and beats 3.1 Pro on code

On the Mountain View stage, Sundar Pichai introduced the first model in the Gemini 3.5 family. The detail that matters: Flash beats Gemini 3.1 Pro on agentic and coding benchmarks at a fraction of the cost.

2026-05-19T23:00:00+02:00

Google AI Ultra drops from $250 to $200/month (and a new $100 plan for developers launches)

A positioning move: Google cuts $50 from the top plan and introduces a $100/month mid-tier plan with five times the Pro plan's limits, aimed at developers and knowledge workers.

2026-05-19T20:45:00+02:00

Gemini Omni: Google's new multimodal model turns text, images, and audio into video

Announced by Sundar Pichai on the Mountain View stage, Gemini Omni is the model that reasons across all media. The first member of the family, Gemini Omni Flash, will arrive this summer on the Gemini app, YouTube Shorts, and Flow.

2026-05-19T19:30:00+02:00