Google unveiled Gemini 2.5 Flash, a fast and cheap AI model

At the Google Cloud Next conference, the company officially unveiled Gemini 2.5 Flash, an improved and optimized version of its AI model that runs faster, costs less, and gives more precise control over computing resources. This is Google’s next step toward commercially viable generative AI, following the high-profile premiere of an experimental version of Gemini 2.5 Pro last month.

Google is now moving from experimentation to actually deploying next-generation models across its product ecosystem, from the Vertex AI developer platform to end users of the Gemini app.

What is Gemini 2.5 Flash and why is it needed

Gemini 2.5 Flash is based on the same code as 2.5 Pro, but optimized for lighter tasks. It returns faster responses to simple queries while consuming fewer resources, which lowers the cost of using AI for both Google and its customers.

While Flash is not yet available in the consumer Gemini app, it is already being deployed in Vertex AI, Google’s cloud platform for enterprise and development solutions. The Flash model is primarily aimed at automation, chatbots, analytics assistants, and other scenarios where speed and cost-efficiency matter most.

“Dynamic thinking” is now controllable

One of the key features of Gemini 2.5 was the introduction of “dynamic thinking”: the model itself decides how much “mental effort” to put into a solution depending on the complexity of the query. In the Flash version, this behavior is now more flexible and controllable.

Developers now have access to a “thinking budget”: they can set how deeply the model should analyze a query, balancing speed, accuracy, and response cost. This is especially important in commercial applications, where every millisecond and every penny counts.
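As a purely illustrative Python sketch (not Google’s actual API; every name below is hypothetical), the budget idea amounts to letting the caller cap how much reasoning a request may consume, so that simple queries stay fast and cheap while hard ones get more depth:

```python
# Illustrative sketch only: a caller-side heuristic mapping query complexity
# to a per-request "thinking budget". All names here are hypothetical and
# do not reflect the real Gemini API surface.

def pick_thinking_budget(query: str, max_budget: int = 1024) -> int:
    """Crude heuristic: longer, presumably harder queries get a larger budget."""
    words = len(query.split())
    if words <= 5:             # trivial lookup-style query
        return 0               # skip extended reasoning entirely
    if words <= 25:            # routine request
        return max_budget // 4
    return max_budget          # complex, multi-step request

def estimated_reasoning_cost(budget: int, rate_per_token: float = 1e-6) -> float:
    """Cost grows with however many reasoning tokens the caller allows."""
    return budget * rate_per_token

# A short query gets no extended reasoning; a long one gets the full budget.
print(pick_thinking_budget("capital of France"))      # 0
print(pick_thinking_budget(" ".join(["step"] * 40)))  # 1024
```

The point of the sketch is the tradeoff itself: the budget is a knob the application controls per request, rather than a fixed property of the model.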

In the coming weeks, Google will also add context caching and supervised fine-tuning to Vertex AI, making the model even more adaptable to specific tasks.

Deep Research is now live on Gemini 2.5 Pro

As Flash launches, the older Gemini 2.5 Pro model has a new purpose: it now powers the Deep Research tool that previously ran on version 2.0. Deep Research lets the user specify a topic and receive a detailed, generated report gathered from relevant sources on the web.

With the move to Gemini 2.5 Pro, the quality and accuracy of responses have improved markedly. According to Google, in user tests, reports from Gemini 2.5 Pro receive more than twice as many favorable ratings as those from OpenAI models.

That said, access to this version of Deep Research is limited: full features are available only to Gemini Advanced subscribers, while free users get a stripped-down version.

Where this is going

The Gemini 2.5 series of updates signals the beginning of a new phase: introducing AI that can work efficiently and cost-effectively on real-world tasks. Until now, generative AI has been expensive to run, but Google is starting to bring costs down with lighter models and new TPU processors.

Moving all Gemini products to the 2.5 branch is only a matter of time. With this approach, Google isn’t betting on abstract benchmarks, but on practicality, manageability, and business relevance.
