Gemini 2.5 Flash
Gemini 2.5 Flash is Google's first fully hybrid reasoning model, letting developers toggle thinking on or off and set thinking budgets to tune the balance between quality, cost, and latency, all on top of the fast, multimodal foundation of 2.0 Flash.
File InputReasoningTool UseVision (Image)Web SearchImplicit Caching
index.ts
import { streamText } from 'ai'
const result = streamText({ model: 'google/gemini-2.5-flash', prompt: 'Why is the sky blue?'})Latency24 hours
P50 time to first token (TTFT) on live AI Gateway traffic, in milliseconds. View the docs for more info.