Database Mirroring Monitor can cause large tempdb growth in SQL Server 2005 - changed behavior in SQL Server 2008
On extremely rare occasions in SQL Server 2005, when the following conditions are true:
1. You have a database configured for database mirroring
2. You have replication configured on the same database
3. You are running the database mirroring monitor
4. The SQL Agent job created by the database mirroring monitor is enabled
It was possible to encounter a situation where tempdb grew suddenly and significantly, leading to all the complications that such a situation causes, including but not limited to:
Msg 1105, Level 17, State 2, Line 3
Could not allocate space for object xxxxxxx in database 'tempdb' because the 'PRIMARY' filegroup is full. Create disk space by deleting unneeded files, dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
Error: 17053, Severity: 16, State: 1.
C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\DATA\tempdb.mdf: Operating system error 112(There is not enough space on the disk.) encountered.
Error: 1101, Severity: 17, State: 2.
Could not allocate a new page for database 'tempdb' because of insufficient disk space in filegroup 'PRIMARY'. Create the necessary space by dropping objects in the filegroup, adding additional files to the filegroup, or setting autogrowth on for existing files in the filegroup.
If you were able to capture this situation and have some monitoring running, you would notice the following event captured in profiler:
An execution of the following which is the main entry point stored procedure for the mirroring monitor:
exec sys.sp_dbmmonitorupdate @database_name
followed by an execution of :
exec @retcode = sys.sp_dbmmonitorMSgetthelatestlsn @database_name, @end_of_log_lsn output
which was the last procedure executed by the mirroring monitor before tempdb started to grow. If you were to actually trace the details of this procedure, you could see the statements which it executes. What you will see is that last statement actually executed before tempdb starts filling is this:
dbcc dbtable(blog_mirror) with tableresults, no_infomsgs
Being that this procedure is actually held in the resource database (the others are in msdb so you can take a look easily if you want to), it is protected from normal viewing, so I won't reproduce it here, but it does exactly what it's name would imply, it gets the latest LSN and then uses this to do some calculations of amount of time required to synchronize and other such functions.
The problem lies here in that in extremely rare occurrences this procedure can cause huge tempdb growth due to the way it calculates the latest LSN using dbcc dbtable (an undocumented internal DBCC command) and stores it's temporary calculations in a temp table. Unfortunately the only way to resolve this problem in SQL Server 2005 is to disable the mirroring monitor job in SQL Agent. You should only consider or need to do this though if you are definitely experiencing this specific problem, i.e. you have all the pre-requisite conditions and you have a profiler trace proving that the dbcc command was the command executed before tempdb filled up.
However the good news is that in SQL Server 2008 we made a design change to collect this data in a far more efficient way. If you look at the results returned from a query of the sys.database_mirroring DMV in SQL 2008:
select *
from sys.database_mirroring
you will note 2 columns have been added to the output:
mirroring_end_of_log_lsn
and
mirroring_replication_lsn
Meaning that the information that the mirroring monitor requires is now available in a much simpler format. Therefore if you look at the SQL 2008 version of the msdb stored procedure sys.sp_dbmmonitorupdate, you will note that the execution of sys.sp_dbmmonitorMSgetthelatestlsn has been replaced with a simple select from the new DMV columns:
select @role = (mirroring_role - 1),
@status = mirroring_state,
@witness_status = mirroring_witness_state,
@failover_lsn = mirroring_failover_lsn,
@end_of_log_lsn = mirroring_end_of_log_lsn
from sys.database_mirroring where database_id = @database_id
which is both simpler, faster and avoids any possibility of the tempdb problem occurring.
Comments
Anonymous
January 04, 2011
Graham thanks for the post from 2008. there was very little else out there. I was investigating this problem for the third time in a couple of months in our installation and found the culprit was indeed sp_dbmmonitorresults. It basically ran for over 2 hours until I killed it taking Tempdb from 5 - 83 Gigs. Problem is, its now Jan 2011 and I am still running on SQL 2005 for a few more months in Production. Was this problem confirmed in SQL 2005? Do you know if there is a Hotfix? Is there also a relationship between this issue and Read Committed Sanpshot which makes extensive use of Tempdb? Is the work around? Turning this Job off will leave me without Database mirroring monitoring statistics which I use to generate Alerts. Sounds to me like the solution in 2008 could have been applied to 2005? Any info you can provide would be great. Thanks...RayAnonymous
January 04, 2011
The comment has been removedAnonymous
May 15, 2013
The comment has been removed