Azure OpenAI services and rate limits

Alex Thaman 20 Reputation points
2024-09-03T19:23:32.7366667+00:00

I've deployed an Azure OpenAI service and model deployment and I'm a bit confused as to what I'm looking at in the Azure portal.

When I go to Azure OpenAI Studio and look under "Quotas", I see the following (I've obscured some of the names below):

Quota Name                              Resource type  Usage/Limit
Tokens Per Minute (thousands) - gpt-4o                 150 of 150
(a few rows with other models)
myopenai                                OpenAI         140
myhub12345678                           AIServices     10

I just don't understand what this means. It looks like it is adding the token limits together. I've somehow deployed this service in two separate ways, and I can't figure out what this maps to conceptually. Can someone explain what I am looking at here?

Azure OpenAI Service

Accepted answer
  1. Adharsh Santhanam 3,570 Reputation points
    2024-09-04T05:18:07.4166667+00:00

    Hello Alex Thaman. Quota is assigned to your subscription on a per-region, per-model basis, in units of tokens per minute (TPM). Your subscription is onboarded with a default quota for most models, and you can divide that TPM among your deployments until the quota is used up. If you hit a model's TPM limit in a region, you can reassign quota among deployments or request a quota increase. Alternatively, if viable, consider creating a deployment in a different Azure region within the same geography as the existing one.

    For example, with a 240,000 TPM quota for GPT-35-Turbo in East US, you could create one deployment of 240K TPM, two deployments of 120K TPM each, or any number of deployments whose TPM adds up to no more than 240K in that region.
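The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration of the rule "allocations in a region may not exceed the model's quota"; the function name `remaining_tpm` and the deployment names are hypothetical and are not part of any Azure SDK.

```python
def remaining_tpm(quota_tpm: int, deployments: dict[str, int]) -> int:
    """Return the unallocated TPM for one model in one region.

    Raises ValueError if the deployments together exceed the quota.
    """
    allocated = sum(deployments.values())
    if allocated > quota_tpm:
        raise ValueError(
            f"allocated {allocated} TPM exceeds quota of {quota_tpm} TPM"
        )
    return quota_tpm - allocated


# Quota figure taken from the example above (GPT-35-Turbo, East US).
QUOTA_TPM = 240_000

# Hypothetical deployment names; two deployments of 120K TPM each.
print(remaining_tpm(QUOTA_TPM, {"chat-a": 120_000, "chat-b": 120_000}))
```

A third deployment of any size would push the total past 240K and raise `ValueError`, which mirrors the point at which you would need to reassign quota or request an increase.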

    There is also a limit of 30 Azure OpenAI resource instances per region. The view in your screenshot shows the TPM side of this: because quota belongs to the subscription rather than to an individual resource, the 140 allocated under myopenai (an OpenAI resource) and the 10 allocated under myhub12345678 (an AIServices resource) both draw from the same gpt-4o quota, which is why the total reads 150 of 150. Here's a useful reference: https://techcommunity.microsoft.com/t5/fasttrack-for-azure/optimizing-azure-openai-a-guide-to-limits-quotas-and-best/ba-p/4076268
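As a rough sketch of how the figures in the question fit together, the snippet below sums the per-resource rows into the single usage line shown for gpt-4o. It assumes the Quotas page simply adds the per-resource allocations for a model in a region; the helper `summarize_usage` is hypothetical, and the rows are copied from the question.

```python
def summarize_usage(rows: list[tuple[str, str, int]], limit_k: int) -> str:
    """Combine per-resource allocations (in thousands of TPM) into one line."""
    used_k = sum(tpm_k for _name, _kind, tpm_k in rows)
    return f"{used_k} of {limit_k}"


# (resource name, resource type, allocated TPM in thousands), from the question.
gpt4o_rows = [
    ("myopenai", "OpenAI", 140),
    ("myhub12345678", "AIServices", 10),
]

print(summarize_usage(gpt4o_rows, limit_k=150))
```

Under that assumption, the two resources are not separate quotas: they are two consumers of one subscription-level quota.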

