Azure OpenAI services and rate limits

Question

I've deployed an Azure OpenAI service and model deployment and I'm a bit confused as to what I'm looking at in the Azure portal.

When I go to Azure OpenAI Studio and look under "Quotas", I see the following (I've obscured some of the names below):

Quota Name	Resource type	Usage/Limit
Tokens Per Minute (thousands) - gpt-4o		150 of 150
(a few rows with other models)
myopenai	OpenAI	140
myhub12345678	AIServices	10

I just don't understand what this means - it looks like it is adding the token limit together. I've somehow deployed this service two separate ways, and I can't figure out what this is mapping to conceptually. Can someone explain what I am looking at here?

Accepted Answer

Hello Alex Thaman. Quota is assigned to your subscription on a per-region, per-model basis in units of Tokens-per-Minute (TPM). Your subscription is onboarded with a default quota for most models. You can allocate TPM among your deployments of a model in a region until their total reaches that quota. Once a model's TPM quota in a region is fully allocated, you can reassign quota among the existing deployments or request a quota increase. Alternatively, if viable, consider creating a deployment in a different Azure region in the same geography as the existing one.

For example, with a 240,000 TPM quota for GPT-35-Turbo in East US, you could create one deployment of 240K TPM, two deployments of 120K TPM each, or any number of deployments whose TPM adds up to 240K or less in that region.
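The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration of the rule "deployments in one region must sum to no more than the model's quota"; the function name `remaining_quota` and the deployment names are made up for the sketch and are not part of any Azure SDK.

```python
def remaining_quota(quota_tpm: int, deployments: dict[str, int]) -> int:
    """Return the unallocated TPM for one model in one region.

    Raises ValueError if the deployments would exceed the quota.
    `deployments` maps a (hypothetical) deployment name to its TPM.
    """
    allocated = sum(deployments.values())
    if allocated > quota_tpm:
        raise ValueError(
            f"allocated {allocated} TPM exceeds quota of {quota_tpm} TPM"
        )
    return quota_tpm - allocated


QUOTA = 240_000  # GPT-35-Turbo in East US, from the example above

# One deployment using the whole quota.
print(remaining_quota(QUOTA, {"chat-a": 240_000}))

# Two deployments of 120K TPM each.
print(remaining_quota(QUOTA, {"chat-a": 120_000, "chat-b": 120_000}))

# Several smaller deployments that leave some quota unallocated.
print(remaining_quota(QUOTA, {"chat-a": 100_000, "chat-b": 60_000}))
```

The first two calls leave nothing unallocated, and the third leaves 80,000 TPM that could go to another deployment in the same region.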

There is also a limit of 30 Azure OpenAI resource instances per region. In the table you posted, the "Tokens Per Minute (thousands) - gpt-4o" row is your subscription's gpt-4o quota for that region, and the rows beneath it show how that quota is currently allocated: 140K TPM to deployments in the myopenai resource and 10K TPM to deployments in the myhub12345678 resource, which together use all 150K. Here's a useful reference: https://techcommunity.microsoft.com/t5/fasttrack-for-azure/optimizing-azure-openai-a-guide-to-limits-quotas-and-best/ba-p/4076268
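As a rough illustration of how the numbers in the question's Quotas table relate, here is a small Python sketch that sums the per-resource rows for a model and prints them against the limit. The row values are copied from the question; the assumption that both resources sit in the same region and subscription, and the names `rows`, `LIMITS_K`, and `usage_by_model`, are mine for the sake of the example.

```python
from collections import defaultdict

# (model, resource name, resource type, allocated TPM in thousands),
# taken from the table in the question.
rows = [
    ("gpt-4o", "myopenai", "OpenAI", 140),
    ("gpt-4o", "myhub12345678", "AIServices", 10),
]

# Per-model quota for the region, in thousands of TPM (from the question).
LIMITS_K = {"gpt-4o": 150}


def usage_by_model(rows):
    """Sum allocated TPM (thousands) across all resources, per model."""
    totals = defaultdict(int)
    for model, _name, _resource_type, tpm_k in rows:
        totals[model] += tpm_k
    return dict(totals)


for model, used_k in usage_by_model(rows).items():
    print(f"Tokens Per Minute (thousands) - {model}: {used_k} of {LIMITS_K[model]}")
```

Under those assumptions the two resources are not separate quotas being added together; they are two consumers drawing from one shared per-region, per-model quota.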

Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.
