SQL Server: Curious Case of Logging in Online and Offline Index Rebuild in Full Recovery Model

↑ Back to top


Introduction

Microsoft SQL server Forums are filled with lot of questions regarding index rebuild generating huge amount of logs. We will see in this article how logging is different in Offline and Online Index rebuild . We all know that when we rebuild SQL Server index it is dropped and recreated. But is the index physically dropped ? No, it is not physically dropped. What actually happens internally we would see using undocumented commands fn_dblog, DBCC PAGE. This article is going to demonstrate what all operations internally execute when index is rebuild and how much logs are produced. Queries related to this article is run on test machine, please don't use fn_dblog,DBCC PAGE command on your production database. These commands are not supported by Microsoft(although safe to use) and any change to these commands  won't be posted to SQL Server users through Books On Line.

↑ Back to top


Creating Scenario for the Test

To perform this test let's create a  database called IndexLogging . We will use SQL Server 2012 Evaluation edition for this test. Evaluation edition has Enterprise edition features and since ONLINE index rebuild is an Enterprise feature Evaluation edition can be used for the test.

Query 1

USE [master]
GO
  
--Create Database Command
CREATE DATABASE  [IndexLogging]
 ON  PRIMARY
( NAME  = N'IndexLogging_dat',
 FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\IndexLogging_data.mdf' ,
 SIZE = 51200KB ,
 MAXSIZE = 51200KB ,
  FILEGROWTH = 15 %)
 LOG ON
( NAME  = N'IndexLogging_log',
FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL\DATA\IndexLogging_log.ldf' ,
SIZE = 10240KB ,
MAXSIZE = 25600KB ,
FILEGROWTH = 5120KB );
GO
--To remove database from pseudo simple recovery
 
BACKUP DATABASE  [IndexLogging] TO  DISK='D:\Indexlogging1.bak'

After creating the database let's create two tables one Filler table and other table will be  Logging. After both tables are created we will insert few records in them.

Query 2

USE [IndexLogging]
 
GO
--Create filler table
CREATE TABLE  [DBO].[FILLER]
(
C1 INT  IDENTITY,
C2 CHAR(50) DEFAULT  'SOME DATA'
);
GO
 
--Insert default records into filler table
 
INSERT INTO  [DBO].[FILLER] DEFAULT VALUES;
GO 10000
 
--Create table named Logging
 
CREATE TABLE  [dbo].[Logging]
(
    [Id] [int] IDENTITY(1,1) NOT NULL,
    [c1] [char](50) DEFAULT  'SOME MORE DATA' NOT  NULL
);
--Create Unique cluster Index
 
CREATE UNIQUE  CLUSTERED INDEX  IX_LOGGING ON  [DBO].[LOGGING] (ID)
GO
--Insert default records
 
INSERT INTO  DBO.Logging DEFAULT  VALUES;
GO 10000
--check fragmentation for the table Logging
SELECT
    [avg_fragmentation_in_percent]
FROM sys.dm_db_index_physical_stats (
    DB_ID (N'IndexLogging'), OBJECT_ID (N'LOGGING'), 1, NULL,  'LIMITED');
GO



You can see fragmentation is almost 4 %. Actually it is 3.79746835

Now we will fragment the index by dropping Filler table and then running DBCC SHRINKFILE command for data file. Shrinking data file causes fragmentation and thus DBA's avoid shrinking data file. I used this feature to fragment the data file

Query 3



DROP TABLE  DBO.FILLER;
GO
--Shrinking data file to bring fragmentation
DBCC SHRINKFILE (IndexLogging_dat, 3)
GO
SELECT
[avg_fragmentation_in_percent]
FROM sys.dm_db_index_physical_stats (
 DB_ID (N'IndexLogging'), OBJECT_ID (N'LOGGING'), 1, NULL,  'LIMITED');



After shrinking data file fragmentation is now 94 %. Actually it is 93.6708860 %

See how evil DBCC SHRINKFILE command is when used for data file. Fragmentation grew from 4 % to 94 %.

Let's also analyze page numbers corresponding to pages present at root level of index, We will require this information further down the line to reach to various conclusion and to prove different aspects. From SQL Server 2012 onwards UNDOCUMENTED DMV sys.dm_db_database_page_allocations provide great information regarding page number corresponding to data, index an IAM pages.

Query 4
**
**

--Query to get page Information

Use IndexLogging
GO
SELECT allocated_page_file_id,
allocated_page_page_id,
page_type_desc
FROM sys.dm_db_database_page_allocations
(db_id('IndexLogging'), object_id('Logging'), NULL,  NULL, 'DETAILED')
order by
allocated_page_page_id,
page_type_desc



The result is below

Figure 1
**
**

You can see IAM page number is 150, 156 is the index page and other number corresponds to various data pages in data file.The query output gave  81 rows .To see index structure and its level below query can be used

Query 5
**

**

Use IndexLogging
GO
SELECT index_depth AS [Depth of Index],
index_level AS  [Level  of Index],
record_count AS  'No of records Present',
page_count AS  PgCnt,
avg_page_space_used_in_percent AS  'PgPercentFull'
FROM sys.dm_db_index_physical_stats
    (DB_ID ('IndexLogging'),
    OBJECT_ID ('dbo.logging'),
   1, NULL,  'DETAILED');

Below is output

Figure 2
**
**

You can see root level of Index has 79 pages and one page is index page and in figure 1 we have 81 rows as output out of which one page corresponds to IAM and other to Index page, so 79 data pages and one index page and one IAM page is what we can see.

↑ Back to top


Performing Index Rebuild Offline

Now we will rebuild index offline to remove fragmentation in an explicit transaction so that we can capture amount of transaction log it generates. To capture amount of transaction log records generated we will  used DMV Sys.dm_tran_database_transaction. Then we will use fn_dblog to view transaction log entries.

Query 6
**
**

BEGIN TRANSACTION  LOGGING_TEST
USE INDEXLOGGING 
ALTER INDEX  IX_LOGGING ON  [dbo].[Logging]
REBUILD
--(ONLINE=ON)
  
SELECT  database_transaction_log_record_count,
database_transaction_log_bytes_used
    FROM    SYS.DM_TRAN_DATABASE_TRANSACTIONS
    WHERE   database_id = db_id('IndexLogging');
  
  
 COMMIT

Below is the screenshot of the output.

Figure 3

From output you can see 186 is transaction log record count or number of log records generated for this transaction. 667KB of bytes was used by this transaction  for Offline index rebuild. Also note that DMV sys.dm_tran_database_transaction will not gives CORRECT information about log produced as after commit some other transactions are done internally which produces more logs. So we cannot say that database transaction Log record count is JUST 186 it can be more than that

Now let's see what happens internally when Index is rebuilt offline. We will use fn_dblog command to get the output.

Query 7

--Query to look at contents of transaction log

USE INDEXLOGGING

GO

SELECT
OPERATION,
Context,
[Transaction Name],
[AllocUnitId],
[Page ID],
Description,
[Lock Information]
FROM FN_DBLOG(NULL,NULL)
WHERE [Transaction ID] in
(
select [Transaction ID] from
fn_dblog(Null,Null)
where [Transaction Name]='Logging_Test'
)

Below is the snippet of the output. The output produced 187 rows which are same as our output above . Please follow the serial numbers (present at extreme left hand side of the output)

Figure 4

I will explain various related rows in details.

First row with serial number 1 is actually the row for the start of implicit transaction Logging_test. LOP_BEGIN_XACT signifies beginning of a transaction. You can see the the transaction name in [Transaction Name] column.

The rows from serial no 2nd to 8th corresponds to lock taken for metadata update. For row no 2 schema modification lock was taken on table Logging. The lock is shown as  HOBT 0:ACQUIRE_LOCK_SCH_M OBJECT: 9:597577167:0. Which means database ID is 9 and object ID is 597577167. If we query sys.objects for this object id we can get information about this object.

SELECT TYPE,
TYPE_DESC FROM  SYS.OBJECTS
 WHERE OBJECT_ID=597577167

Figure 5
**

**

Above screenshot shows that the object ID 597577167 belonged to table logging and this table is being locked.

The third row has HOBT 327680:ACQUIRE_LOCK_IX OBJECT: 9:5:0 ;ACQUIRE_LOCK_X KEY: 9:327680 (1e2a7b1b9dfa). If you use query mentioned in figure 5 you can see that object ID 5 is Sysrowsets which is a system table and lock was taken to update system tables or you can say a kind of medadata update. Similarly row corresponding to serial no 5 has object ID 7(6:7:0) which is sysallocationunits.

Now row 7th and 8th in there lock information have HOBT id 196608 to get information related to this ID we can use below query

Query 8
**
**

--query to get information about HOBT ID

select  so.name,so.type,so.type_desc,sp.rows,si.name,si.type_desc from  sys.objects so
 join
sys.partitions sp
on
so.object_id=sp.object_id
 join
sys.indexes si
on
so.object_id=si.object_id
where
 hobt_id=196608

Figure 6
**
**

We can see that HOBT ID corresponds to object sysrscols and update was being made to rows which stores information related to clustered index.

In row 9 there is LOP_HOBT_DDL which actually signifies that a new partition would be created for new pages when rebuild will be performed. Each index has at least one partition associated with it. At max an index can have 15000 partitions.

The row corresponding to serial no 12 is important.The operation LOP_MODIFY_ROW with  context is LCK_PFS  actually signifies the formatting of page header  and the relevant entry of the page in PFS  page. If you see description column for this row you will notice Allocated 0001:000000ae.

Now  0001:000000ae means page filenumber:Pagenumber page number. This value can be converted into hexadecimal by just taking ae and converting it into decimal Decimal of ae is 174( Use internet to get Hexadecimal to decimal convertor). Now this 174 is IAM page for the data file.IAM( Index allocation Map) page see figure 7 below.

IAM page stores information of extents which are allocated by a clustered index or a non clustered index so actually IAM page was modified with extents being allocated for new index. I will show below that page 174 is IAM page for the database again using DMV Sys.dm_db_database_page_allocations

Lock information clearly mentions that 174 is the page being locked for making the change (HOBT 6488064:ACQUIRE_LOCK_X PAGE: 9:1:174 )

Query 9

SELECT  page_type_desc,
        page_level,
        previous_page_page_id,
        allocated_page_page_id,
        next_page_page_id
FROM    sys.dm_db_database_page_allocations
        (db_id(), OBJECT_ID('dbo.logging', 'U'), 1,  DEFAULT, 'DETAILED')

Below is the output obtained after running above query. And you can see that page 174 is IAM page

Figure 7

Now this IAM page was updated because in Figure 4 above  if you see rows 16-24 it all has operation name LOP_FORMAT_PAGE which means operation that corresponds to writing on header of newly allocated page.Which means just after metadata update operation Index rebuild operation is first formatting the headers of new page, which will be allocated during index rebuild. These pages would be data pages. Interesting thing to note is PageID ( See figure 4)

0001:00000170= Page no 368 ( use convertor to convert hex page to decimal)

0001:00000171= Page no 369

0001:00000172= Page no 370

0001:00000173= Page no 371

0001:00000174= Page no 372

0001:00000175= Page no 373

Now lets see what this pages corresponds to. If you see in figure 7( Just above)  page 368 is first page present at leaf level of the index. Page 369 is second page, so on and so forth till row number 93( please go down to the output produced in Figure 4)

Now let's move to row 113 to 115 as per below figure

Figure 8
**
**

We can see that pages are getting deallocated.The context is LOCK_PFS , PFS page stores amount of free space on page and this is what updated in PFS page about pages that are getting de-allocated. Interesting thing to note is page is not deleted it is just marked as deallocated in PFS page see LCK_PFS in context column. This operation of just marking the pages as de-allocated and changing its bit has advantage. First it speeds up the operation second if operation has to rollback( like in implicit transaction) it would be fast as just PFS and IAM pages need to be updated about the record instead of allocating whole page again.Once pages is de-allocated record is present on the page till it is re-utilized and overwritten.

Interesting thing to note here is row 102 the operation is LOP_SET_BITS the page ID is 0001:000000ae and context is Lock_IAM. Page ae is actually page 174 which is IAM page. Please see fig 1. LOP_SET_BITS signifies  IAM page is updated about pages that are free. We all know that  IAM page stores information of extents which are allocated by a clustered index or a non clustered index .

 Lets take example of page 1c8 or page 456.Using DBCC PAGE command we can show that page is marked de-allocated

Query 10
**

**

dbcc traceon (3604) 
go 
dbcc page (Indexlogging,1,456,1)

Figure 9
**
**

From above you can see page 456 is marked as deallocated. A simple look at the lock information for rows 93 to 95 tells us the page that was being locked and context column show LCK_PFS which means for this page just it is marked as de-allocated in PFS page.

Lets now move to output for row no 124 to 147

Figure 10

Deallocated 0001:0000009c=156

Deallocated 0001:000000a3=163

Deallocated 0001:000000a4=164

Deallocated 0001:000000c8=200

Deallocated 0001:000000d8=216

Deallocated 0001:000000e8=232

If you refer to figure 1 all these pages belonged to index before it was rebuilt. All what is happening is deallocation of pages from by updating PFS and IAM pages

 Now lets move to last part of output

Figure 11
**
**

LOP_DELETE_ROWS signifies rows being removed from metadata which corresponded to index before it was rebuilt.`For row 166 it has page ID 92 which is actually page no 146. This page is index page and corresponds to Object ID 5( Use query 10 to see information about this page) object ID 5 is sysrowsets . The important thing to note here is lock being taken on object ID 5 and 3. Which are system tables If you look at output of row 173  you can see lock HOBT 562949956960256:ACQUIRE_LOCK_IX OBJECT: 9:54:0 ;ACQUIRE_LOCK_X KEY: 9:562949956960256 (a444422dc013). Lock is being taken on database ID 9 and object ID 54 . Object ID 54 is sysidxstats which means that when index is rebuilt with full scan it update stats for the column as well.

Finally the LOP_INSERT_ROWS to update the metadata about index which is re-created.Pages were formatted before the last part was updating various system tables regarding new pages which are now used for index.

↑ Back to top


Performing Index Rebuild Online.

If you performing the test for Online index rebuild I would like you to drop the database IndexLogging and run queries (query 1 ,2,3) to create a new database and then finally create fragmentation in it.After creating fragmentation lets rebuild index online. Online index rebuild is feature only in Enterprise,Developer and evaluation edition.

Query 12
**
**

--Implicit transaction to view amount of transaction logs generated
BEGIN TRANSACTION  LOGGING_TEST
USE INDEXLOGGING 
ALTER INDEX  IX_LOGGING ON  [dbo].[Logging]
REBUILD with
(ONLINE=ON)
  
SELECT  database_transaction_log_record_count,
database_transaction_log_bytes_used
    FROM    SYS.DM_TRAN_DATABASE_TRANSACTIONS
    WHERE   database_id = db_id('IndexLogging');
  
  
 COMMIT

Below is the output of the query

Figure 12
**
**

Online index rebuild with default options produced just 95 log records and just used 9 KB of log space. This is very much less as compared to log records produced during offline index rebuild.

Lets run fn_dblog command to analyze behavior of log in Online index rebuild running same command as in Query 5.Below is the result

The output just produced 74 rows as compared to 121 rows produced in online index rebuild.

Figure 13

The interesting thing to see if there is no page allocation information LOP_FORMAT_PAGE like what we saw in Offline index rebuild.See figure 4 for more details.Just the de-allocations of page through PFS and IAM pages update.this is main reason for less logging of records in ONLINE index rebuild. I guess by now you would have understood pages de-allocated were actually those belonging to Data pages at root level of index see figure 1

See description column

Deallocated 0001:000000a5=Page 165

Deallocated 0001:000000a6=Page 166

Deallocated 0001:000000a8=Page 168

Deallocated 0001:000000a9=Page 169

Again pages are deallocated from IAM pages and PFS page is updated.No physical delete operation happens

Finally what comes is last part of output is below. Operation is same and is explained above for Offline index rebuild.Deletion of rows from metadata and finally inserting of information into metadata for new index.Lock information is very helpful in telling on which object lock was taken. and the same would be modified.

If you go down to output you will see records quite similar to in Offline Index rebuild.Which I guess don't needs explanation as they have been explained above.

↑ Back to top


Conclusion

Following are conclusions drawn from the above post.

1.During Offline index rebuild page allocations are done before de-allocating the pages corresponding to old index.This only happens in Offline mode and causes log to grow

2.. Index pages is not actually dropped pages are just de-allocated, de-allocation is done by removing entries from PFS and IAM pages. In general term you can call it deleted but correct term would be de-allocated.

  1. In online index rebuild pages are de-allocated first and then DDL for new pages are being created.

Note

I would further take up logging in rebuilding indexes in Bulk Logged recovery model in my next article. It could have been included in this article but it would have made this article very lengthy and tedious.

↑ Back to top


Suggested Readings

Understanding GAM SGAM and PFS pages

Choosing recovery model for index operation

How to use DBCC PAGE command

Inside into storage engine GAM SGAM and PFS pages

DBCC Shrinkfile

Sys.dm_db_index_physical_stats

Alter Index Command


See Also

SQL Server General & Database Engine Resources on the TechNet Wiki

List Of articles by Shanky