Google launches Gemini 3.1 Flash-Lite, a faster and cheaper AI model

According to Google, Gemini 3.1 Flash-Lite is designed with speed as its focus. (Source: Google)

Google has announced Gemini 3.1 Flash-Lite, a new multimodal AI model variant focusing on speed and low costs for large-scale applications.

Cheaper than other Gemini models

According to a blog post from Google, Gemini 3.1 Flash-Lite is significantly cheaper than other models in the Gemini range. The model costs $0.25 per million input tokens and $1.50 per million output tokens. By comparison, Gemini 3.1 Pro, Google’s most powerful model, starts at $2 per million input tokens and $18 per million output tokens.
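At those list prices, the gap compounds quickly at scale. A minimal cost-estimate sketch, where only the per-token prices come from the article; the model keys and the example workload volumes are illustrative assumptions:

```python
# USD list prices per million tokens: (input, output).
# Prices are the ones quoted in the article; the dict keys are illustrative.
PRICES = {
    "gemini-3.1-flash-lite": (0.25, 1.50),
    "gemini-3.1-pro": (2.00, 18.00),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a workload at list prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Hypothetical monthly workload: 50M input tokens, 10M output tokens.
lite = cost_usd("gemini-3.1-flash-lite", 50_000_000, 10_000_000)
pro = cost_usd("gemini-3.1-pro", 50_000_000, 10_000_000)
print(f"Flash-Lite: ${lite:,.2f} vs. Pro: ${pro:,.2f}")  # → Flash-Lite: $27.50 vs. Pro: $280.00
```

For this workload, Flash-Lite comes out roughly ten times cheaper, which is the kind of margin that matters for the high-volume use cases described below.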

The model is also faster. In internal tests, Flash-Lite generated responses 45% faster than Gemini 2.5 Flash, while the time to the first output token is reportedly 2.5 times shorter.

Aimed at large-scale tasks

Gemini 3.1 Flash-Lite can process multimodal prompts up to 1 million tokens and generate responses up to 64,000 text tokens. The model can also generate code, for example, to build dashboards or other visual applications.

Google expects developers to use the model primarily for high-volume tasks with limited reasoning requirements. Examples include translating product catalogs or automatically moderating content on e-commerce platforms.

Benchmark results

Across eleven benchmarks, Flash-Lite achieved the highest score on six, beating GPT-5 mini and Claude 4.5 Haiku, among others.

The model achieved a strong score on GPQA Diamond, a benchmark of doctoral-level questions. On the demanding HLE (Humanity's Last Exam) benchmark, it scored 16%, compared with 44.4% for Gemini 3.1 Pro.

Gemini 3.1 Flash-Lite is currently available in preview via Vertex AI and Google AI Studio.