Understanding “Connection forcibly closed by remote host” Errors Caused by TOE/Chimney

Sporadic “Connection forcibly closed by remote host” errors with SQL Server connections can be very difficult to troubleshoot and resolve. This blog post covers how to diagnose TOE/Chimney issues that can lead to this client error message. Chimney is a feature introduced in the Windows Server 2003 Scalable Networking Pack, which was included in Windows Server 2003 SP2. It increases network performance when used with a network card that implements TOE (TCP/IP Offload Engine), a hardware implementation of the TCP/IP stack.

The following are the symptoms to look for:

· The client connection is sporadically failing with the message: “TCP Provider: Connection forcibly closed by remote host.” The client connection may, in addition, sometimes fail with the message: “General network error”.

· There are no corresponding network-related error messages in the SQL Server instance’s ERRORLOGs. Normally, the “Connection forcibly closed by remote host” message on the client indicates that an error occurred on the server which is deemed severe enough to close the connection; in that case, the server would log an error message explaining why the connection was closed. An example error message for this would be Error 17828: “The prelogin packet used to open the connection is structurally invalid; the connection has been closed. Please contact the vendor of the client library.” However, if the issue is in the networking hardware, such as a TOE-related issue, there will be no message in the SQL Server instance’s ERRORLOGs for this connection closure, since the server is not intentionally closing the connection. Therefore, check the SQL Server ERRORLOG for an absence of any corresponding network-related error messages.

· There is no other client killing the first client’s connection. In addition to potential network hardware causes, the “Connection forcibly closed” message can also appear with no corresponding server ERRORLOG message if the client’s connection is being killed by a different client. Examine the SQL Server ERRORLOG for KILL statements; if there are none, then no other client is killing SQL Server connections.

If all three of these symptoms are present, your problem is likely due to a faulty piece of network hardware, possibly a TOE/Chimney issue.
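The two ERRORLOG checks above can be partly automated. The following is a minimal Python sketch, not a definitive tool. The names `scan_errorlog` and `hardware_suspected` are hypothetical, and the search patterns are assumptions: the first matches error 17828 (the example above) or the phrase "connection has been closed", and the second assumes that a KILL is logged with the wording "was killed by". Adjust both to the messages your SQL Server version actually writes.

```python
import re
from typing import Dict, Iterable, List

# Assumed patterns for illustration only; adjust to what your server logs.
PATTERNS = {
    # Server-side closures such as error 17828 from the example above.
    "network_error": re.compile(
        r"Error: 17828\b|connection has been closed", re.IGNORECASE
    ),
    # Assumed wording of the ERRORLOG entry written when a session is killed.
    "kill": re.compile(r"\bwas killed by\b", re.IGNORECASE),
}


def scan_errorlog(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Return the ERRORLOG lines that match each pattern, keyed by pattern name."""
    hits: Dict[str, List[str]] = {name: [] for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.rstrip())
    return hits


def hardware_suspected(hits: Dict[str, List[str]]) -> bool:
    """True when the log shows neither a server-side closure nor a KILL."""
    return not hits["network_error"] and not hits["kill"]
```

To use it, pass the lines of an ERRORLOG file, for example `scan_errorlog(open(path, encoding="utf-16"))`. The encoding is an assumption; change it if your log files are stored differently. An empty result only means the log does not explain the closure, so the client-side symptom still has to be confirmed separately.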

To test whether TOE/Chimney is the source of your problem, disable it and see whether the problem goes away. Do this on BOTH the client and the server, since TOE/Chimney on either machine, or both, could be the cause of the issue. To disable Chimney on Windows Server 2003, run this command:

netsh int ip set chimney DISABLED

On Windows Vista or Windows Server 2008, run the equivalent command at an elevated command prompt:

netsh int tcp set global chimney=disabled

This command does NOT require a reboot. If you have these symptoms and disabling Chimney doesn’t fix the problem, then you likely have a problem elsewhere in your network hardware and should investigate it next. This KB article should give you some leads on how to begin network troubleshooting: https://support.microsoft.com/kb/325487
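If the test has to be repeated on several clients and servers, choosing the command can be scripted. The following Python sketch is one possible approach under stated assumptions: the function names are hypothetical, Windows Server 2003 (major version 5) is assumed to take the `netsh int ip` form, and Windows Vista or Windows Server 2008 (major version 6) is assumed to take the `netsh int tcp set global chimney=disabled` form. Later Windows versions are not considered here.

```python
import subprocess
import sys
from typing import List


def chimney_disable_command(windows_major_version: int) -> List[str]:
    """Return the netsh argument list assumed to disable TCP Chimney."""
    if windows_major_version >= 6:
        # Windows Vista / Windows Server 2008 (assumed syntax).
        return ["netsh", "int", "tcp", "set", "global", "chimney=disabled"]
    # Windows Server 2003 with the Scalable Networking Pack.
    return ["netsh", "int", "ip", "set", "chimney", "DISABLED"]


def disable_chimney() -> None:
    """Run the command on the local machine (Windows only, elevated prompt)."""
    major = sys.getwindowsversion().major  # available only on Windows
    subprocess.run(chimney_disable_command(major), check=True)
```

Keeping the command selection separate from the call that runs it means the choice can be checked on any machine, while `disable_chimney()` should only be called on the Windows hosts under test.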

Dan Benediktson
SQL Server Protocols
Disclaimer: This posting is provided "AS IS" with no warranties, and confers no rights

Comments

  • Anonymous
    January 07, 2009
    You explain nicely which symptoms to look out for; however, I'm still looking for the reason why this happens. What exactly causes the connections to drop (forced closure)? Is the TOE implementation so buggy, or are there certain scenarios that force the reset of the TCP connections?

  • Anonymous
    January 07, 2009
    I actually need to update this blog post, so thanks for reminding me!  The root cause of this exact problem is a buggy implementation of keepalive in the NIC driver.  Fortunately, in the latest driver version that is available for the affected NICs, that implementation has been fixed, so now rather than turning off this feature entirely, you can update your NIC drivers and get the benefits of TOE. Hope this helps, Dan

  • Anonymous
    January 07, 2009
    Thanks for the very fast response. In our situation we see that behavior between two servers that are both running on VMware ESX. Would the cause you mention also apply to a virtualization environment like this? Is there a KB article that gives an overview of which NICs are affected?

  • Anonymous
    January 07, 2009
    I believe the affected NICs are covered in this KB article: http://support.microsoft.com/kb/942861 I am not positive that this is possible in a VMware environment, although I believe it should be possible if one or both of the host machines are using one of these NICs. Note that this is not by any stretch the only possible cause of "Connection forcibly closed", though - malfunctioning network hardware could give exactly the same symptoms, and used to be the main source of these sorts of error messages before this problem came up. TCP Chimney is definitely the first place to look, though, both because it is a common problem and because it is easier to troubleshoot than a misbehaving switch, NIC, or other piece of hardware.

  • Anonymous
    June 05, 2009
    An existing connection was forcibly closed by the remote host

  • Anonymous
    April 01, 2010
    A connection was successfully established with the server, but then an error occurred during the pre-login handshake. (provider: TCP Provider, error: 0 - An existing connection was forcibly closed by the remote host.) (Microsoft SQL Server, Error: 10054)

  • Anonymous
    December 01, 2015
    Old post, new comment: I found out that my corporate network at work had some ports blocked and that was the reason I got this error. I tried the same connection from a network outside of the corporate one and connection worked right away.