How multiple connections from the same app can kill perf

It's pretty common for a complex app to make multiple connections to SQL Server.  And, on SQL Server 2000 and earlier releases, it's also very easy for those types of app to cause perf problems by distributing their workload across their connections unevenly.  To understand why this is, let's explore what happens when connecting to SQL Server.

When a client connects to SQL Server, it is assigned to a UMS scheduler using a simple round robin algorithm.  The scheduler with the fewest users gets the new one.  Generally speaking, a scheduler is simply a logical representation of a CPU.  Although schedulers are not affinitized to specific CPUs and although the server can create more schedulers than there are physical CPUs in the machine, there is a loose correlation between UMS schedulers and CPUs.  For each CPU on the system to which SQL Server has access, it will create a UMS scheduler to carry out client work requests (e.g., T-SQL language events, RPCs, etc.)

Each client is assigned to a UMS scheduler when it connects to SQL Server and never changes schedulers thereafter.  This second point is important.  Regardless of what's happening elsewhere on the server, a connection never changes schedulers.

Even though UMS schedulers are not hard affinitized to specific CPUs, because of the way that Windows schedules threads, a soft affinity generally occurs wherein a given UMS scheduler ends up always running on the same physical CPU.  Although Windows can and will move UMS schedulers around as it pleases, a given UMS scheduler usually stays with a given physical CPU for extended periods of time.

So, let's say you have an app that opens four connections into SQL Server, and let's say SQL Server has four CPUs available to it.  For the sake of simplicity, let's say your app makes all four connections at startup.  Provided the current connections into the server are already evenly distributed across its UMS schedulers when your app connects, each of the app's four connections will be assigned to a different UMS scheduler.  This will spread the potential work load evenly across the CPUs available to the server.

Now, what happens if those four connections submit workloads to the server that vary widely in size and resource requirements from one another?  What happens if the first two connections actually do all the work, while connections three and four are barely used?  Obviously, whole CPUs go virtually unused.  Compound this with multiple instances of your app, and you can quickly run into contention for CPU (and even memory) resources on two CPUs, while the other two sit around idle most of the time.  Because UMS assigns a client to a scheduler when it first connects and because each UMS scheduler winds up more or less soft affinitized to a given CPU, it's quite easy to get into a situation where a workload imbalance causes perf issues.  The problem is so acute in some cases that I've even heard of apps using undocumented commands to determine what scheduler they end up on after connecting and intentionally disconnecting/reconnecting until they get the scheduler-connection alignment they think best.  I don't recommend this, however.  There is, fortunately, a much better solution.

The ideal solution is also the most obvious one:  if an app opens multiple connections into SQL Server, attempt to spread its workload evenly across those connections.  Then you don't care which scheduler or CPU you end up on because it really doesn't matter.  And keep in mind that with the advent of MARS (Multiple Active Result Sets) in SQL Server 2005, you may be able to get away from opening multiple connections from a single app altogether.

Comments

  • Anonymous
    October 24, 2005
    Good info. But, how does connection pooling (specifically in .NET) relate to this? If you have a single app, and open/close your connections, which connection will be used each time you try to connect()? The first one available, does it round robin the connection usage, something else?

  • Anonymous
    October 24, 2005
    Great article but...
    "attempt to spread its workload evenly across those connections"

    How ?

    I have a VB or Access or what ever Client that calls an SP on the server - how do I build in the intelligence to best make use of the available CPUs ?

  • Anonymous
    November 03, 2005
    How it works with Java connection pooling? We are using Java in frontend with IBM Websphere as App server and connecting SQLServer JDBC driver.
    Thanks.
    --
    Farhan

  • Anonymous
    December 09, 2005
    <i>We are using Java in frontend with IBM Websphere as App server and connecting SQLServer JDBC driver. </i>

    On purpose? <g>

  • Anonymous
    February 28, 2006
    PingBack from http://zbowling.com/blog/2005/11/06/first-thoughts-c-20net-20vs-2005-mssql-2005/

  • Anonymous
    May 06, 2006
    Nice site!
    [url=http://ofznviyl.com/xxwj/lmxo.html]My homepage[/url] | [url=http://uallmxud.com/ptgp/qeym.html]Cool site[/url]

  • Anonymous
    May 06, 2006
    Thank you!
    <a href="http://ofznviyl.com/xxwj/lmxo.html">My homepage</a> | <a href="http://jxarobsq.com/mwoe/sdka.html">Please visit</a>

  • Anonymous
    May 06, 2006
    Good design!
    http://ofznviyl.com/xxwj/lmxo.html | http://nkvakikj.com/tcle/aicx.html

  • Anonymous
    May 15, 2006
    online directory main

  • Anonymous
    September 19, 2006
    Really nice Info BUT:

    - If i had the option to only use a single connection i would obviously do so.
    - If it would be easily possible to spread the workload across multiple connections there would be little reason to use multiple connections.
    - If multiple connections from one application can cause those problems then multiple connections from multiple applications using one connection each could also cause those problems.

  • Anonymous
    September 20, 2006
    >If i had the option to only use a single connection i would obviously do so.

    You might, but a knowledgeable user might not.  Whether or not you use a single connection depends on what you're doing.  It's far from obvious, especially to people who know what they're doing.

    >If it would be easily possible to spread the workload across multiple connections there would be little reason to use multiple connections.

    That's ridiculous.  There are lots of reasons for using multiple connections independent of whether the workload can be easily spread.

    >If multiple connections from one application can cause those problems then multiple connections from multiple applications using one connection each could also cause those problems.

    No kidding?  Did you figure that out all on your own?  What does that have to do with this article?  It probably isn't obvious to you, so I'll spell it out:  you have much more control over the work spread across multiple connections from a single app than you do that across multiple connections from multiple apps.

  • Anonymous
    January 03, 2007
    The comment has been removed

  • Anonymous
    March 13, 2007
    Does anyone know the negative effects of storing session states from multiple applications on the same SQL server?

  • Anonymous
    September 12, 2007
    What is the difference between a UMS scheduler and a OS scheduler ?

  • Anonymous
    June 01, 2009
    PingBack from http://paidsurveyshub.info/story.php?id=75494