Google Eases Gemini Usage Limits After Complaints

News Room

Google’s new Gemini usage limits are getting an early adjustment after some users said complex prompts were draining their quota too quickly.

The company is revising its compute-based limit system, which measures usage based on factors such as prompt complexity, model selection, tools used, and chat length. Google says it will cap how much quota a single Gemini 3.1 Pro request can consume, make Flash-Lite prompts free, clarify that failed requests do not count, and add more detailed usage breakdowns.

Google moves away from simple prompt counts

Google introduced compute-based usage limits for Gemini after its I/O 2026 developer conference. Under the new system, Gemini usage depends on the prompt’s complexity, the model or feature used, and the chat length.

9to5Google reported that Google made adjustments after users complained about hitting limits too quickly.

Josh Woodward, vice president of Google Labs, Gemini App, and AI Studio, wrote on X, “We’ve heard your feedback about hitting limits too quickly on GeminiApp,” according to Heise.

Pro prompts get a quota cap

The main change affects complex Gemini 3.1 Pro requests, especially prompts that include large files. Those requests could consume quota too quickly under the new system.

Woodward said Google is “capping the amount of quota a single prompt can use so you get more out of the Pro model,” 9to5Google noted.

Google also clarified how failed requests are handled. “If a request fails, you won’t be charged. Our system mistakes are on us, not you,” the company said, per 9to5Google.

The clarification addresses cases where users may have seen their quota consumed while testing large files, long prompts, or more demanding Gemini features.

Google said Gemini 3.1 Flash-Lite prompts are now free and will not count against a user’s quota. That gives users a way to keep working with a lighter model while preserving quota for more demanding tasks.
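
Taken together, the three charging rules Google described (a cap on what a single prompt can consume, free Flash-Lite prompts, and no charge for failed requests) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cap value, the compute-unit costs, and the model identifier strings are hypothetical and do not come from Google, which has not been quoted here on how quota is actually calculated.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; none of these come from Google.
PER_PROMPT_CAP = 50.0                      # assumed maximum units one prompt may consume
FREE_MODELS = {"gemini-3.1-flash-lite"}    # assumed identifier for the free model


@dataclass
class Request:
    model: str             # hypothetical model identifier
    estimated_cost: float  # compute units the request would otherwise consume
    succeeded: bool        # False if the request failed


def quota_charge(req: Request) -> float:
    """Return the quota units charged for one request under the three rules."""
    if not req.succeeded:
        return 0.0  # failed requests are not charged
    if req.model in FREE_MODELS:
        return 0.0  # Flash-Lite prompts do not count against quota
    return min(req.estimated_cost, PER_PROMPT_CAP)  # cap a single prompt's draw


# Example: one heavy Pro prompt, one failed Pro prompt, one Flash-Lite prompt.
requests = [
    Request("gemini-3.1-pro", 120.0, True),
    Request("gemini-3.1-pro", 30.0, False),
    Request("gemini-3.1-flash-lite", 5.0, True),
]
total = sum(quota_charge(r) for r in requests)
print(total)  # only the capped Pro prompt is charged
```

In this sketch only the first request draws on quota, and it is charged the capped amount rather than its full estimated cost.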

More usage details are coming

Google plans to add more detailed usage breakdowns and notifications to help users manage their limits. The current usage dashboard at gemini.google.com/usage gives a general overview but does not yet show which tasks consume the most quota.

The company noted that Gemini will remember a user’s selected model across future sessions. The model should change only when the user manually switches it or when a cap triggers an automatic fallback to a lighter model.

Google also fixed an issue that caused one or two Omni video generations to drain quotas for some users. Google AI Ultra users now get double the number of Omni generations, and the company said it will look for more ways to increase Omni access.

The update shows the challenge of making AI usage limits predictable as providers move beyond simple prompt counts. Usage can vary depending on whether someone asks a simple question, uploads a large file, runs Deep Research, or generates a video.
