Azure Container App returns 503 when scaling from 0 with multiple concurrent requests

Pedro Antunes 15 Reputation points
2023-12-29T18:08:18.4933333+00:00

Hi,

I'm currently facing this issue with my Azure Container App (ACA):

When the app has been scaled to 0 and the ACA receives 4 or more requests at the same time, some requests return status 503 while others complete just fine.

status: 503 body: upstream connect error or disconnect/reset before headers. reset reason: connection termination

However, if the ACA is already up and I make the same requests (or even more), it works and scales as expected.

This only happens if I make at least 4 requests at the same time. If I make 3 or fewer, the ACA starts and works as expected too.

It happens even if the requests target an endpoint that does not exist inside the app (in this case, a Node.js Express server).

I found this issue on GitHub that is related to my problem, but I don't know if there are any updates:

https://github.com/microsoft/azure-container-apps/issues/295
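For anyone trying to reproduce the behaviour described above, here is a minimal sketch in Python that fires several requests in parallel and tallies the status codes. It is not the script used in the original post; the helper names `get_status` and `fire_concurrently` and the URL in the usage comment are hypothetical, and the threads start only approximately at the same time.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable
import urllib.error
import urllib.request


def get_status(url: str, timeout: float = 120.0) -> int:
    """Return the HTTP status code of a GET request, including error codes such as 503."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is on the exception.
        return err.code


def fire_concurrently(send: Callable[[], int], count: int) -> Counter:
    """Call `send` `count` times in parallel and tally the returned status codes."""
    with ThreadPoolExecutor(max_workers=count) as pool:
        futures = [pool.submit(send) for _ in range(count)]
        return Counter(future.result() for future in futures)


# Usage (hypothetical URL; run it only after the app has scaled to 0):
#   url = "https://my-app.example.azurecontainerapps.io/"
#   print(fire_concurrently(lambda: get_status(url), count=4))
```

Running it once with `count=3` and once with `count=4` against an app that has scaled to 0 should show whether the 503 responses only appear at 4 or more parallel requests.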

Azure Container Apps
An Azure service that provides a general-purpose, serverless container platform.

3 answers

  1. Deleted

    This answer has been deleted due to a violation of our Code of Conduct. The answer was manually reported or identified through automated detection before action was taken. Please refer to our Code of Conduct for more information.



  2. Ryan Hill 26,866 Reputation points Microsoft Employee
    2024-01-04T20:33:51.0566667+00:00

    Hey @Pedro Antunes

    Thanks for clarifying above. Based on what you've provided, this definitely sounds like a cold start scenario where your app isn't completely ready to accept requests yet. We recommend running at least 1 replica to avoid these sorts of issues, but I understand you want to scale from 0.

    Having said that, if you only experience this issue when starting from 0 replicas, and the transition from activating to running takes on the order of minutes with a low number of concurrent requests, then we would need to work more closely with you to investigate further; just comment below. However, if the startup time is minimal and response time is paramount for your application, then you may need to consider running at least 1 replica to avoid this.
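If keeping a replica running is not an option, a client can sometimes ride out a cold start by retrying on 503. The sketch below is a generic retry with exponential backoff, not something recommended in this answer or provided by Azure Container Apps; `call_with_retry` and its parameters are hypothetical names, and `send` stands for any function that performs the request and returns the status code.

```python
import time
from typing import Callable


def call_with_retry(
    send: Callable[[], int],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it returns something other than 503 or attempts run out."""
    status = send()
    for attempt in range(1, attempts):
        if status != 503:
            break
        # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
        sleep(base_delay * 2 ** (attempt - 1))
        status = send()
    return status
```

The `sleep` parameter exists only so the delay can be replaced in tests. This hides the symptom from callers but does not change how long the app takes to become ready.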


  3. Ryan Hill 26,866 Reputation points Microsoft Employee
    2024-02-13T14:15:15.1633333+00:00

    @Pedro Antunes thanks for reaching out offline.

    The team has determined that the issue was that private certs weren't being respected when activating the pod. Once this patch was applied to your cluster, your issue was resolved.

    For others that were affected by this issue, the patch is in the latest platform rollout and should resolve any incidents. However, if you continue to see an issue, please let me know by commenting on this post.