Azure Container Instance TCP connection interrupts

David Drápela 75 Reputation points
2023-08-10T11:19:28.6433333+00:00

Hi,

I have migrated a service that was previously running on on-premises servers to Azure. The app is based on .NET and I have containerized it with Docker. It is just a single container with no scaling. I have deployed the container to ACI. The app includes two TCP listeners using custom, non-HTTP protocols. The TCP listeners are used for communication with desktop clients and devices.

One problem I had to deal with is that ACI can get a new IP after a restart. Using the FQDN instead of the IP does not fully help, because after each restart the FQDN still resolves to the old IP for a while.
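For clients that connect by FQDN, one way to cope with an address that changes after a restart is to resolve the name again before every reconnect attempt rather than reusing a cached IP. The following is only a minimal sketch of that idea and is not something from this thread: it is written in Python for brevity (the service here is .NET), and the function name `connect_with_fresh_dns` and its parameters are hypothetical. It also cannot bypass caching done by the OS resolver or by upstream DNS servers, so a stale record may still be returned until its TTL expires.

```python
import socket
import time


def connect_with_fresh_dns(host, port, attempts=5, delay_s=2.0, timeout_s=5.0):
    """Try to connect, resolving the host name again before every attempt.

    Hypothetical helper for illustration only.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            # Resolve on every attempt so a changed DNS record is picked up.
            infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        except socket.gaierror as exc:
            last_error = exc
            infos = []
        # Try each returned address in order until one accepts the connection.
        for family, socktype, proto, _canonname, addr in infos:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout_s)
            try:
                sock.connect(addr)
                return sock
            except OSError as exc:
                last_error = exc
                sock.close()
        if attempt < attempts - 1:
            time.sleep(delay_s)
    raise ConnectionError(f"could not connect to {host}:{port}") from last_error
```

A caller would wrap its normal reconnect logic around this function, so that every reconnect after a dropped connection starts with a new lookup.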

With help from another question (https://video2.skills-academy.com/en-us/answers/questions/1343527/azure-container-instances-and-public-access), I managed to redeploy the container into a VNet and then opened the ports for public access via a public Load Balancer. The traffic is routed from the frontend to the service either via load balancing rules or via NAT rules (both solutions work because there is just 1 instance of the service). This works really well, and I imagine I can put more services behind the load balancer in the future.

However, when I create a TCP connection to the service on either of the two ports, the connection is interrupted every few minutes. This is a problem because the connections need to be long-lived for the devices and the desktop clients to work as intended. I have tried raising the idle timeout on both the public IP resource and all inbound rules from the default 4 minutes to 30 minutes, but that did not help. The connection is interrupted at seemingly random intervals: sometimes after 2 minutes, sometimes after 5. Interestingly, if there are multiple connections to the service, they are all interrupted at the same time.
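When long-lived connections drop behind a load balancer, one thing commonly ruled out first is idle-flow expiry, which TCP keepalive probes are meant to prevent. The sketch below is not from this thread and is not the fix that was eventually used; since the drops described here occurred even with a 30-minute idle timeout, keepalive may well not have helped in this case. It only illustrates how keepalive is enabled on a socket. It is written in Python for brevity (the service here is .NET), and the function name `enable_keepalive` and the timing values are hypothetical.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=15, probes=4):
    """Turn on TCP keepalive probes for an existing socket.

    Hypothetical helper for illustration only; the timing values are
    arbitrary examples, not recommendations.
    """
    # Ask the OS to send keepalive probes on this connection when it is idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The tuning options are platform-specific, so set only those that exist.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between probes that go unanswered.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is considered dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock
```

The idle value would need to be shorter than the smallest idle timeout on the path for the probes to have any effect.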

This problem did not happen when the container was deployed with the public network configuration and the connections were made directly to the currently assigned IP or FQDN. Another interesting fact is that the service uses Azure Database for PostgreSQL (Flexible Server) with public access, and some of the connections in the pool also experience interruptions.

Thanks for any suggestions.

The error on the server side:

Unable to read data from the transport connection: Connection timed out.
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)

The error on the client side:

Unable to read data from the transport connection: An existing connection was forcibly closed (
   at System.Net.Security._SslStream.EndRead(IAsyncResult asyncResult)
   at System.Net.Security.SslStream.EndRead(IAsyncResult asyncResult)

The TCP listener in the example uses SslStream, but the listener on the second port uses a plain, unencrypted TcpListener and has the same problem.

Azure Container Instances
Azure Virtual Network
Azure Load Balancer

Accepted answer
  1. Carlos Villagomez 1,106 Reputation points Microsoft Employee
    2023-08-24T18:13:00.2133333+00:00

    Hi @David Drápela,

    I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to "Accept" the answer.

    Issue:
    The customer migrated a .NET service, containerized with Docker, from on-premises servers to Azure. It runs as a single container with no scaling, deployed to ACI, and exposes two TCP listeners that use custom, non-HTTP protocols. TCP connections to the service on either of the two ports are interrupted every few minutes at seemingly random intervals, and when there are multiple connections to the service they are all interrupted at the same time.

    The error on the server side:

    Unable to read data from the transport connection: Connection timed out.
       at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
       at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
    

    The error on the client side:

    Unable to read data from the transport connection: An existing connection was forcibly closed (
       at System.Net.Security._SslStream.EndRead(IAsyncResult asyncResult)
       at System.Net.Security.SslStream.EndRead(IAsyncResult asyncResult)
    

    The TCP listener in the example uses SslStream, but the listener on the second port uses a plain, unencrypted TcpListener and has the same problem.

    Solution:
    The OP migrated from ACI to Azure Container Apps (ACA) and verified that it is now working properly.

    If you have any other questions or are still running into more issues, please let me know.
    Thank you again for your time and patience throughout this issue.

    Please remember to "Accept Answer" if any answer/reply helped, so that others in the community facing similar issues can easily find the solution.


1 additional answer

  1. David Drápela 75 Reputation points
    2023-08-24T10:58:06.57+00:00

    I have solved the problem by migrating from ACI containers to ACA and it works fine. I believe it is better suited for what I need.
