How to avoid a CritSit. An important Coding Best Practice

What is a CritSit?

It’s short for “Critical Situation”, and basically, it's when my red phone rings. It means I have to drop everything and jump on a plane because there is a SharePoint server somewhere in Europe, the Middle East or Africa that needs my help.

Needless to say, this is something we really want to avoid:

  1. It’s bad for our customers because a service important to their business is not operational.
  2. It reflects badly on the product, because the initial assumption is that it's SharePoint at fault.
  3. It’s bad for Microsoft because it’s pretty expensive to fly me around.
  4. It’s bad for me because my dinner plans usually get cancelled.

The interesting thing is, when I look back over the CritSits that I have responded to over the past year or so, a significant number of them follow this pattern:

  1. SharePoint has been deployed, it's running nicely, people start using it
  2. Deployment grows, people have feedback and want to change aspects of the product's functionality
  3. Advanced "Tree Navigation", both at the portal level, and within team sites, is a very popular request
  4. A developer works on a fancy tree navigation control, tests and then deploys
  5. The SharePoint environment becomes unstable with the following symptoms:
    1. SharePoint page render times become slower
    2. Over time this gets worse and worse
    3. Portal or team sites eventually become unresponsive
    4. On the server the W3WP process is consuming lots of memory
  6. At this point, the W3WP process is recycled, either by the application pool, or manually, and the Portal comes back to life
  7. However, over time, the exact same issue repeats itself

This whole process might happen multiple times per day, or just once or twice per week. It is intermittent, so it's hard to reproduce, and often people just "live" with it.

Now, the reason it is hard to reproduce, and why it appears intermittent, is that the amount of load on the server has a big impact on how frequently it occurs. The more users on the system, the more frequently the SharePoint "crash" happens.

When I arrive onsite and find myself facing such an issue I usually start by doing the following:

  1. Build up a test environment, ensuring it resembles the production environment as closely as possible, including customisations
  2. Restore the production data into the test environment
  3. Configure at least one Application Center Test client
  4. Create a simple script, with maybe 20 users, and then use it to start hitting the server
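
If Application Center Test is not to hand, the idea behind step 4 can be sketched in a few lines of Python. This is only a rough illustration, not the ACT script I use onsite: the names `fetch_page` and `run_load_test`, the default counts, and the assumption that the page can be fetched without Windows authentication are all mine.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_page(url):
    """Request one page and return how long it took, in seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()
    return time.perf_counter() - start


def run_load_test(url, users=20, requests_per_user=10, fetch=fetch_page):
    """Simulate `users` concurrent users, each requesting `url` repeatedly.

    `fetch` is injectable so the harness can be tried without a server.
    Returns a flat list of response times, one entry per request.
    """
    def one_user(user_index):
        # Each simulated user issues its requests one after another.
        return [fetch(url) for _ in range(requests_per_user)]

    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))

    # Flatten the per-user lists into a single list of timings.
    return [t for timings in per_user for t in timings]
```

Calling `run_load_test("http://testportal/default.aspx")` (a made-up test URL) would issue 200 requests from 20 threads. Response times that climb steadily over a run are the symptom to watch for.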

In most cases I’m able to reproduce the problem fairly reliably. For example, I can say "Running test script X with 20 users for Y minutes will always cause a 'crash'". At this point I typically remove the "Tree Navigation" control and re-run the script, only this time of course there is no crash, thereby proving the control is causing the issue.

With this testing complete, and our problem control identified, we can look into it in more detail. Here there is also a pattern: on nearly every occasion the cause has been one of two things:

  1. Custom code not disposing of SPWeb or SPSite objects correctly.
  2. Portal or Team Sites that breach the capacity planning guidelines, particularly around sites and document libraries.
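
To show why the first cause bites harder as load increases, here is a small language-neutral sketch in Python. It is an analogy only and does not use the SharePoint object model: `FakeSite` is a hypothetical stand-in for an object that holds memory until it is explicitly released, and `render_leaky` and `render_safe` are invented names. In .NET code the corresponding tools are `Dispose()` and the C# `using` statement.

```python
class FakeSite:
    """Hypothetical stand-in for an object that holds unmanaged memory."""

    open_count = 0  # instances currently holding their resource

    def __init__(self, url):
        self.url = url
        self.closed = False
        FakeSite.open_count += 1

    def close(self):
        # Release the resource exactly once, even if called repeatedly.
        if not self.closed:
            self.closed = True
            FakeSite.open_count -= 1

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


def render_leaky(url):
    """Anti-pattern: the object is never closed, so every call leaks one."""
    site = FakeSite(url)
    return f"tree for {site.url}"


def render_safe(url):
    """The object is released when the block exits, even on error."""
    with FakeSite(url) as site:
        return f"tree for {site.url}"
```

In this sketch every call to `render_leaky` leaves one more object open, so the count grows with the number of page requests, which matches the way the real problem gets worse as more users hit the server. `render_safe` leaves the count unchanged however often it runs.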

The good news is that we *finally* have a great whitepaper that describes, in detail, some of the key code practices you should implement in order to avoid this type of issue, and therefore avoid a CritSit:

Best Practices: Using Disposable Windows SharePoint Services Objects

Of course, the other way of avoiding such issues is to ensure your code is fully tested, including complete load testing. It is incredibly important that you create an environment, with production data, that can be used to simulate the production load. This environment then allows you to develop baselines, which you can use to determine the overall impact of a particular customisation.
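
As a rough sketch of what using a baseline might look like, the Python function below compares response times measured before and after a customisation is deployed. The function name and the choice of the median as the summary figure are my assumptions; any consistent measure would do.

```python
from statistics import median


def compare_to_baseline(baseline_times, candidate_times):
    """Return the percentage change in median response time.

    A positive result means the customised build is slower than the
    baseline. Both arguments are lists of response times in seconds,
    and the baseline median is assumed to be non-zero.
    """
    base = median(baseline_times)
    cand = median(candidate_times)
    return (cand - base) / base * 100.0
```

Because a leak makes things worse over time, it can also be worth comparing the early and late portions of a single long run, not only two separate runs.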

Anyway, take a look at the whitepaper, and good luck coding!

 

Written and posted using Microsoft Word 2007 Beta 2

Comments

  • Anonymous
    July 25, 2006
    Hello Mr. McPherson,

    I am just researching SharePoint and am really interested in it. I have one question which maybe you can answer, because I couldn't find the answer anywhere and still have no clue about it.

    Can SharePoint be used to develop a custom web application, let's say a complete school attendance system (student log, teacher log, administration reporting, etc.), with a layout and interface very different from a typical SharePoint portal? In other words, can we use SharePoint as the development framework for our web-based application?

    Is that possible? If yes, can you share some references?

    Regards,
    AW

  • Anonymous
    July 28, 2006
    Very sorry to interrupt you.
    This problem has puzzled me so long and I don't know who else I can ask for help except you.
    I deployed the site with Forms Authentication and I want to let users register on the site. After the user inputs the information, I want the system to add the user account to the site group so that everything happens automatically.
    I have three ideas:
    1. Put a web part containing a CreateUserWizard control on the MOSS 2007 site's default page.
    2. Build an ASP.NET website inside the MOSS 2007 site (something like "http://ServerName:PortNumber/_layouts/MyWeb") including a registration page with a CreateUserWizard control.
    3. Build a separate ASP.NET website.
    I put the Group.Add method in the CreateUserWizard_CreatingUser event.
    In the first or second case, the site displays a 401 error or redirects straight to the login page of the SPS site. (When I log on to the site with an appropriate account, this works with no problem.)
    In the third case, I can't manage to log the user on to the site automatically because the registration page is not associated with the SPS site.
    Any ideas?
    With many many thanks.
    Joy