SQL Server Change Data Capture (CDC)
Introduction
"Change data capture is designed to capture insert, update, and delete activity applied to SQL Server tables, and to make the details of the changes available in an easily consumed relational format. The change tables used by change data capture contain columns that mirror the column structure of a tracked source table, along with the metadata needed to understand the changes that have occurred." -MSDN Article
Real World Scenario
There are two scenarios where Change Data Capture can play an important part:
1. For data auditing
2. For Use with Data warehousing
In the first scenario, CDC can be very useful for tracking data changes for auditing purposes.
A concrete example for this is a bank application where a depositor would query
what has happened to his account and how his account has changed over time
or in a much simpler term a historical view of his count.
Another scenario where CDC can play an important part is an accounting application
where a dispute may usually happen.
CDC, however, is designed for a short time audit because it has a retention period. However, you
can disable retention and cleansing of CDC data. The default retention for CDC is three days.
Any application that would require an immediate access to the tail-end part of the transaction
would be a great candidate for immediate audit requirements.
The second scenario where CDC can play an important part is in data warehousing. CDC table can be
used as data source for the ETL rather than the actual OLTP this is due to the fact that CDC can contain
a more historical data as compared to the regular OLTP tables. Another good reason is that you can do the
ETL without disturbing the base tables.
Performance
The strongest selling point of CDC is performance. As you can see from the diagram below.
CDC tables are taken directly from the database log without disturbing your OLTP tables.
The downside, of course, is that the logs cannot be trimmed unless CDC has completed its task.
Although CDC is making use of transaction logs, the database is not required to be in full recovery mode.
Figure 1. CDC Architecture. Diagram courtesy of *MSDN
*
System Requirements
CDC works on the following version of SQL Server:
1. Enterprise edition
2. Developer edition
Change Data Capture Step-by-Step
Step #1. Create a database to be enabled for CDC
* Create database cdctestdb*
Step #2. Enable CDC on the database
* use cdctestdb
exec sys.sp_cdc_enable_db*Step #3. Verify that CDC is enabled on the database
select name, is_cdc_enabled from sys.databases
CDC will also create system table in the database
The System table which have been created under CDC schema are listed below:
cdc.captured_columns – This table returns result for list of captured column.
cdc.change_tables – This table returns list of all the tables which are enabled for capture.
cdc.ddl_history – This table contains history of all the DDL changes since capture data enabled.
**cdc.index_columns **– This table contains indexes associated with change table.
cdc.lsn_time_mapping – This table maps LSN number (for which we will learn later) and time.Step #4. Create a Table
CREATE TABLE [dbo].[students](
[student_id] [int] IDENTITY(1,1) NOT NULL,
[lastname] [varchar](50) NULL,
[firstname] [nchar](50) NULL,
[middle_name] [nchar](50) NULL,
CONSTRAINT [PK_students] PRIMARY KEY CLUSTERED
(
[student_id] ASC
))
What's new
Step #4. Enable CDC on the Table
*exec sys.sp_cdc_enable_table
@source_schema=N'dbo',
@source_name=N'students',
@role_name=NULL
*Step #5. Verify that CDC is enabled in the table
select name,is_tracked_by_cdc from sys.tables
where is_tracked_by_cdc=1results
name is_tracked_by_cdc
------ -------------------
students 1Step #6. Insert records
insert into students (lastname,firstname,middle_name)
values
('student1ln','student1fn','student1mn'),
('student2ln','student2fn','student2mn'),
('student3ln','student3fn','student3mn')Step #7. Check if the inserted record are being tracked
Notice that SQL Server will create a system table to track changes.
Run a select query on the cdc table to see audit entries
Step #8. Disable CDC
USE cdctestdb
GO
EXEC sys.sp_cdc_disable_db
GO
What's new in Change Data Capture in SQL Server 2012?
A new feature of Change Data Capture in SQL Server 2012 is the addition of Change Data Capture
in SQL server Data Capture for Oracle database. For more information please see the link below.
http://blogs.msdn.com/b/mattm/archive/2012/03/26/cdc-for-oracle-in-sql-server-2012.aspx