Troubleshooting database growth issues in Configuration Manager 2007

Hi Everyone, my name is Prakash and I recently worked an interesting case that I wanted to share with you here.  The issue was that the ConfigMgr 2007 database had suddenly grown quite large and threatened to consume all available disk space.  This turned out to be caused by excessive status messages and I thought it might be helpful to blog this with the complete troubleshooting steps you can follow just in case you happen to face a similar issue.

Issue:  The SCCM database size may grow to a very large size and disk space may begin to run out. 

Troubleshooting: To find out what tables are consuming most of the space, run the SQL script from the following link:

http://www.sqlteam.com/article/finding-the-biggest-tables-in-a-database

In our case we discovered that the three status message tables were consuming most of the disk space as you can see below:

Table Name                                       Size        Rowcount          

StatusMessageinsstrs                           25GB      485,117,953

StatusMessages:                                 15GB      121,476,253

StatusMessageAttributes                      11GB      213,802,902

You can next use the SQL command below to find out which component is causing these status messages:

select Component, count(*) from vStatusMessages group by Component order by count(*) desc

OUTPUT:-

Component                                                                    Count

Unknown SMS Application                                              119720070
Windows Installer SourceList Update Agent                        53191
Software Updates Scan Agent                                         31804
Software Distribution                                                      22932
Microsoft.ConfigurationManagement.dll                             20371
Software Distribution Content Access                                6527

What we found was that the Unknown SMS Application is generating maximum status messages.  We then used the SQL query below to find out which message IDs are the ones which are generated in bulk and found that 30061s and 30007s were the problematic Message IDs:

select MessageID, count(*) from vStatusMessages group by MessageID order by count(*) desc

Output:-

Message ID            Count
30061                    89707427
30007                    30041666
10093                    33231
10090                    19941
11423                    18708
11424                    12280

If you look at the details of this status message you can see that a 30061 is generated when we try to modify the instance security for user and a 30007 when user modified the advertisement:

30061 Informational User "<User>" modified instance security rights for user "<UserName>" on instance "<InstanceKey>" of object type "<ObjectKey>".
30007 Informational User "<User>" modified the Advertisements Properties of an advertisement with advertisement ID <AdvertisementID>.

Based on this we began to look for any kind of scheduled task that might be doing this activity and found a couple custom scripts that were looping every 60 seconds.  These scripts were doing things such as moving Advertisements from the Advertisement Root into an Advertisement folder and setting instance rights to an AD global group for newly created collections that were not assigned to the AD global group at the class level.

Resolution:  The Delete Aged Status Messages Task properties dialog box does not allow you to specify an age for the messages to delete. Instead, you must configure the age of messages to delete by using status filter rules in the Configuration Manager console. The default status filter rules keep audit messages for 180 days and all other messages for 30 days.

In the above scenario, as they are audit messages, the existing data will be deleted from the database only after 6 months so we followed the action plan below to prevent the issue from reoccurring and to reduce the number of days to retain the data in the database for these messages IDs:

1. We reduced the frequency of the script execution to every 2 hours instead of every 60 seconds.

2. It is not a good practice to reduce the number of days to retain the data for Audit messages to a lower value for all audit messages as status messages might be required to diagnose the Configuration Manager 2007 system.  To change these intervals for the problematic status messages we created two status filter rules: One for Message ID 30061 and one for ID 30007 and set the number of days to a lower value.

Once the steps above are complete, the old data will gradually be deleted and the new data will be retained based on the days specified in step #2.

*Note: This information was originally contributed by Prakashan A K, Support Escalation Engineer, on the Configuration Manager Support Team blog:
*

http://blogs.technet.com/configurationmgr/archive/2009/01/27/troubleshooting-database-growth-issues-in-configuration-manager-2007.aspx