Updated · 9to5Google · May 28
Google Eases Gemini Limits, Makes 3.1 Flash-Lite Free as AI Ultra Omni Quota Doubles

12 articles · Updated · 9to5Google · May 28
  • Google revised Gemini’s new compute-based limits after users said quotas were being exhausted too quickly, capping how much quota a single 3.1 Pro prompt can consume.
  • The changes target complex prompts with large files and heavy tools such as Deep Research, which can burn through the 5-hour refresh allowance before users reach their weekly cap.
  • 3.1 Flash-Lite prompts are now free and do not count against quota. Failed requests no longer consume usage either, because only successful completions are charged.
  • Google also fixed a bug that let one or two Omni videos drain quotas for some users and doubled the number of Omni generations available to AI Ultra subscribers.
  • To reduce confusion around the compute-based system introduced at I/O 2026, Google said it will add more detailed usage breakdowns and notifications, followed later by pay-as-you-go AI credits for topping up quota.
How can Gemini users predict a prompt's 'compute cost' before they hit send and risk their weekly quota?
Will pay-as-you-go credits create a future where only the wealthy can afford truly powerful AI assistants?
Is Google's 'free' AI model a strategic move to undercut smaller competitors and dominate the AI market?