Google Eases Gemini Limits, Makes 3.1 Flash-Lite Free as AI Ultra Omni Quota Doubles
12 articles · Updated · 9to5Google · May 28
Google revised Gemini’s new compute-based limits after users said quotas were being exhausted too quickly, capping how much quota a single 3.1 Pro prompt can consume.
The changes target complex prompts with large files and heavy tools such as Deep Research, which can burn through the 5-hour refresh allowance before users reach their weekly cap.
3.1 Flash-Lite prompts are now free and do not count against quota, and failed requests no longer consume usage because only successful completions are charged.
Google also fixed a bug that let one or two Omni videos drain quotas for some users and doubled the number of Omni generations available to AI Ultra subscribers.
To reduce confusion around the compute-based system introduced at I/O 2026, Google said it will add more detailed usage breakdowns and notifications, followed later by pay-as-you-go top-up AI credits.