Are Azure OpenAI Service Rate Limits Shared Between Deployments?

Arik Levy 0 Reputation points
2024-11-05T17:28:20.1633333+00:00

I have two Azure OpenAI resources, one called dev and the other prod. Both have a GPT-4 model deployed in UK South (UKS), with different model versions.

The prod deployment has a TPM limit of 35k and the dev deployment has a TPM limit of 10k.

While testing, I ran into rate limit issues on the dev service, so I wrote a simple for loop that calls the model with a basic question and waits one second between requests. The token count for these requests is negligible. Despite this, I hit a rate limit error after the 10th request, even though the RPM limit is set to 60.
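
For reference, here is roughly what the loop looks like (a minimal sketch assuming the Python `openai` package v1+ with an `AzureOpenAI` client; the endpoint, key, and deployment name below are placeholders):

```python
import time
from openai import AzureOpenAI

# Placeholders for the dev resource's endpoint, key, and API version
client = AzureOpenAI(
    azure_endpoint="https://<dev-resource>.openai.azure.com/",
    api_key="<api-key>",
    api_version="2024-06-01",
)

for i in range(60):
    # Each request asks a trivial question, so the token usage is negligible
    response = client.chat.completions.create(
        model="<gpt-4-deployment-name>",  # deployment name, not the base model name
        messages=[{"role": "user", "content": "What is 2 + 2?"}],
    )
    print(i, response.choices[0].message.content)
    time.sleep(1)  # wait one second between requests
```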

Is it possible that the two deployments are interfering with each other in terms of rate limits?

Azure OpenAI Service

1 answer

  1. Arik Levy 0 Reputation points
    2024-11-06T10:43:23.0166667+00:00

