Gemini 3.1 Flash-Lite: Built for intelligence at scale

Today, we're introducing Gemini 3.1 Flash-Lite, our fastest and most cost-efficient Gemini 3 series model. Built for high-volume developer workloads at scale, 3.1 Flash-Lite delivers high quality for its price and model tier. Starting today, 3.1 Flash-Lite is rolling out in preview to developers via the Gemini API in Google AI Studio and for enterprises via Vertex AI.

Cost-efficiency without compromise

Priced at just $0.25/1M input tokens and $1.50/1M output tokens, 3.1 Flash-Lite delivers enhanced performance at a fraction of the cost of larger models. According to the Artificial Analysis benchmark, it outperforms 2.5 Flash with a 2.5X faster Time to First Answer Token and a 45% increase in output speed, while maintaining similar or better quality. This low latency is needed for high-frequency workflows, making it an ideal model for developers to build responsive, real-time experiences.

Gemini 3.1 Flash-Lite outperforms 2.5 Flash in speed and quality.

3.1 Flash-Lite achieves an impressi
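
To make the quoted pricing concrete, here is a minimal sketch that turns the two per-token rates into a rough cost estimate. Only the $0.25 and $1.50 per 1M token rates come from the announcement; the function name and the workload figures (10,000 requests, 2,000 input and 500 output tokens each) are hypothetical, and real bills may differ based on pricing details not covered here.

```python
# Rates quoted in the announcement (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Rough cost estimate from token counts; ignores any other pricing factors."""
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000


# Hypothetical workload, chosen purely for illustration.
requests = 10_000
input_tokens_per_request = 2_000
output_tokens_per_request = 500

total = estimate_cost_usd(
    requests * input_tokens_per_request,
    requests * output_tokens_per_request,
)
print(f"Estimated cost: ${total:.2f}")  # prints "Estimated cost: $12.50"
```

In this assumed workload the output tokens account for $7.50 of the $12.50 total even though they are a quarter of the input volume, because the output rate is six times the input rate.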
