How to: Use a Custom Change Tracking System
Many applications require that changes be tracked in the server database so that these changes can be delivered to clients during a subsequent synchronization session. This topic describes the requirements for a change tracking system, and shows you how to create a custom system that Sync Framework can use. Custom change tracking is appropriate in some cases. However, be aware that it does introduce complexity and can affect server database performance. If you are using SQL Server 2008, we recommend that you use the SQL Server change tracking feature. For more information, see How to: Use SQL Server Change Tracking.
Server Requirements for Synchronization Scenarios
Sync Framework is designed to have minimal effect on the server database. Therefore, modifications that are required for change tracking in the server database are proportional to the level of functionality that you want in an application. Keep in mind the following considerations:
On one end of the spectrum, there is a download-only snapshot of data. This requires no changes.
On the other end, there is bidirectional synchronization that uses full change tracking and conflict detection.
The following table summarizes the ways in which you can use Sync Framework, and identifies the corresponding requirements for the server database.
Scenario |
Primary key or unique column1 |
Track update time |
Track insert time |
Track delete time |
Track client ID for updates |
Track client ID for inserts |
Track client ID for deletes |
---|---|---|---|---|---|---|---|
Download a snapshot of data to the client. |
No |
No |
No |
No |
No |
No |
No |
Download incremental inserts and updates to the client. |
Yes |
Yes |
Yes2 |
No |
No |
No |
No |
Download incremental inserts, updates, and deletes to the client. |
Yes |
Yes |
Yes2 |
Yes |
No |
No |
No |
Upload inserts to the server. |
Yes |
No |
No |
No |
No |
No3 |
No |
Upload inserts and updates to the server. |
Yes |
No |
No |
No |
No3 |
No3 |
No |
Upload inserts, updates, and deletes to the server. |
Yes |
No |
No |
No |
No3 |
No3 |
No3 |
Bidirectional inserts and updates with conflict detection. |
Yes |
Yes |
Yes2 |
No |
Yes4 |
Yes4 |
No |
Bidirectional inserts, updates, and deletes with conflict detection. |
Yes |
Yes |
Yes2 |
Yes |
Yes4 |
Yes4 |
Yes4 |
1 Primary keys must be unique across all nodes, and must not be reused: If a row is deleted, the primary key of that row should not be used for another row. Identity columns are typically not an appropriate choice for distributed environments. For more information about primary keys, see Selecting an Appropriate Primary Key for a Distributed Environment.
2 Required if you want to distinguish between inserts and updates. For more information, see "Determining Which Data Changes to Download to a Client" later in this topic.
3 Required if more than one client might change a row and you want to identify which client made the change. For more information, see "Identifying Which Client Made a Data Change" in this topic.
4 Required if you do not want to echo changes back to the client that made them. For more information, see "Identifying Which Client Made a Data Change" in this topic.
Note
In addition to the changes that are described earlier, you will likely create stored procedures for data access. Most of the examples in this documentation use inline SQL because it is easier to show what is occurring in the code. In production applications, stored procedures should be used for the following reasons: they encapsulate code, typically perform better, and can provide increased security over inline SQL if they are written correctly.
Determining Which Data Changes to Download to a Client
In download-only and bidirectional synchronization, you must track changes on the server so that Sync Framework can determine which changes to download to clients. Although Sync Framework does not define specifically how to support change tracking, there is a common way to approach it. For each table to be synchronized, you can take the following approach:
Add a column that tracks when a row was inserted in the server database.
Add a column, and in some cases a trigger, that tracks when a row was last updated in the server database.
Add a tombstone table and a trigger that tracks when a row was deleted from the server database. If you do not want to delete data from the server but have to send deletes to the client, logical deletes can be tracked in the base table: Use a column, typically of type bit, to indicate that a row is deleted and another column to track when the delete occurred.
These columns and tombstone tables are used together with anchors to determine which inserts, updates, and deletes to download. An anchor is just a point in time that is used to define a set of changes to synchronize. Consider the following queries:
The query that you specify for the SelectIncrementalInsertsCommand property. This query downloads incremental inserts from the Sales.Customer table in the Sync Framework sample database, as follows:
SELECT CustomerId, CustomerName, SalesPerson, CustomerType FROM Sales.Customer WHERE InsertTimestamp > @sync_last_received_anchor AND InsertTimestamp <= @sync_new_received_anchor
For more information about this property and other properties that are related to synchronization commands, see How to: Specify Snapshot, Download, Upload, and Bidirectional Synchronization.
The query that you specify for the SelectNewAnchorCommand property. This query retrieves a point-in-time value. The InsertTimestamp column stores timestamp values. Therefore, the query uses the Transact-SQL MIN_ACTIVE_ROWVERSION function, introduced in SQL Server 2005 Service Pack 2, to retrieve a timestamp value from the server database, as follows:
SELECT @sync_new_received_anchor = MIN_ACTIVE_ROWVERSION - 1
MIN_ACTIVE_ROWVERSION returns the lowest active timestamp (also known as rowversion) value in the current database. A timestamp value is active if it is used in a transaction that has not yet been committed. If there are no active values in the database, MIN_ACTIVE_ROWVERSION returns the same value as @@DBTS + 1. MIN_ACTIVE_ROWVERSION is useful for scenarios such as data synchronization that use timestamp values to group sets of changes together. If an application uses @@DBTS in its anchor commands rather than MIN_ACTIVE_ROWVERSION, it is possible to miss changes that are active when synchronization occurs.
When the Sales.Customer table is first synchronized, the following process occurs:
The new anchor command is executed. The command returns the value 0x0000000000000D49. This value is stored in the client database. The table has never been synchronized. Therefore, there is no anchor value that was stored in the client database from a previous synchronization. In this case, Sync Framework uses the lowest value that is available for the SQL Server timestamp data type: 0x0000000000000000. The query that Sync Services executes is as follows. This query downloads the schema and all rows from the table.
exec sp_executesql N'SELECT CustomerId, CustomerName, SalesPerson, CustomerType FROM Sales.Customer WHERE (InsertTimestamp > @sync_last_received_anchor AND InsertTimestamp <= @sync_new_received_anchor)',N'@sync_last_received_anchor timestamp, @sync_new_received_anchor timestamp', @sync_last_received_anchor=0x0000000000000000, @sync_new_received_anchor=0x0000000000000D49
During the second synchronization, the new anchor command is executed. Rows have been inserted since the last synchronization. Therefore, the command returns the value 0x0000000000000D4C. The table has been synchronized before. Therefore, Sync Framework can retrieve the anchor value 0x0000000000000D49. This value is stored in the client database from the previous synchronization. The query that is executed is as follows. The query downloads only those rows from the table that were inserted between the two anchor values.
exec sp_executesql N'SELECT CustomerId, CustomerName, SalesPerson, CustomerType FROM Sales.Customer WHERE (InsertTimestamp > @sync_last_received_anchor AND InsertTimestamp <= @sync_new_received_anchor)', N'@sync_last_received_anchor timestamp, @sync_new_received_anchor timestamp', @sync_last_received_anchor=0x0000000000000D49, @sync_new_received_anchor=0x0000000000000D4C
For examples of update and delete commands, see How to: Download Incremental Data Changes to a Client and How to: Exchange Bidirectional Incremental Data Changes Between a Client and Server.
As noted earlier, the command that is used to retrieve anchor values depends on the data type of the tracking columns in the server database. The examples in this documentation use SQL Server timestamp, also known as rowversion. To use a SQL Server datetime column, the query for new anchor command would be similar to the following:
SELECT @sync_new_received_anchor = GETUTCDATE()
To determine which data type to use for an anchor, you should weigh the application requirements and consider how much flexibility you have to change the server database schema. If the database is in development, you could specify exactly which columns and triggers to add. If the database is in production, your options might be more limited. Consider the following guidelines:
All tables in a synchronization group should use the same data type and new anchor command. If you can, use the same data type and command for all groups.
The datetime data type is easy to understand, and tables frequently already have a column that tracks when a row was modified. However, this data type can be problematic if clients are in different time zones. If you use this data type, transactions could be missed when incremental changes are selected.
The timestamp data type is precise and does not depend on clock time. However, each table in a SQL Server database can contain only one column of this data type. Therefore, if you must distinguish between inserts and updates, you can add a column of a different data type, such as binary(8), and store timestamp values in that column. For an example, see Setup Scripts for Database Provider How-to Topics. The timestamp data type could be an issue if the server database is restored from a backup. For more information, see Backing Up and Restoring a Database. As noted earlier, we recommend that you use MIN_ACTIVE_ROWVERSION in the command that selects a new anchor.
Identifying Which Client Made a Data Change
There are two main reasons to identify which client made a data change:
To support conflict detection and resolution in upload-only and bidirectional synchronization.
If the server and client or more than one client can change a given row, you might want to identify who made the change. This information lets you write code, for example, that prioritizes one change over another. Without this information, the last change made to the row is persisted.
To prevent echoing of changes back to the client during bidirectional synchronization.
Sync Framework first uploads changes to the server, and then downloads changes to the client. If you do not track the identity of the client that made a change, the change will be uploaded to the server and then downloaded back to the client during the same synchronization session. In some cases, this echoing of changes is acceptable, but in other cases it is not.
As with change tracking, Sync Framework does not specifically define how to support identity tracking; however, there is a common way to approach it. For each table to be synchronized, you can take the following approach:
Add a column to the base table that tracks who made each insert.
Add a column to the base table that tracks who made each update.
Add a column to the tombstone table that tracks who made each delete.
These columns and tables are used with the ClientId property to determine which client made each insert, update, or delete. The first time any table is synchronized by using a method other than snapshot synchronization, Sync Framework stores a GUID value on the client that identifies that client. This ID is passed to the DbServerSyncProvider so that it can be used by the select and update queries in each SyncAdapter. The ID value is available through the ClientId property. Consider the following Transact-SQL query:
SELECT CustomerId, CustomerName, SalesPerson, CustomerType FROM
Sales.Customer WHERE InsertTimestamp > @sync_last_received_anchor AND
InsertTimestamp <= @sync_new_received_anchor AND InsertId <>
@sync_client_id
This query is similar to the earlier query that tracks inserts that are made on the server. The statement in the WHERE clause ensures that the only inserts that are downloaded are those inserts that were not made by the client that is currently synchronizing.
Sync Framework also enables applications to identify clients by using an integer at the server instead of a GUID value. For more information, see How to: Use Session Variables.
Server Preparation Examples
The following examples show how to set up the Sales.Customer table from the Sync Framework sample database with the tracking infrastructure to handle the most complex application scenario: bidirectional insert, update, and delete operations with conflict detection. Less complex scenarios do not require the whole infrastructure. For more information, see "Server Requirements for Synchronization Scenarios" earlier in this topic. For a complete script that creates the objects in this example and additional objects, see Setup Scripts for Database Provider How-to Topics. For more information about how to use these objects, see How to: Specify Snapshot, Download, Upload, and Bidirectional Synchronization.
The examples in this section perform the following steps in preparing a server:
Verify the Sales.Customer schema. Determine whether the table has a primary key and any columns that could be used for change-tracking.
Add columns to track when and where inserts and updates are made.
Create a tombstone table, and add a trigger to the Sales.Customer table to populate the tombstone table.
Verifying the Sales.Customer Schema
The following code example shows the schema of the Sales.Customer table. The table has a primary key on the CustomerId column and has no columns that could be used for change tracking.
Adding Columns to Track Insert and Update Operations
The following code example adds four columns: UpdateTimestamp, InsertTimestamp, UpdateId, and InsertId. The UpdateTimestamp column is a SQL Server timestamp column. This column is automatically updated when the row is updated. As it has been noted, a table can have only one timestamp column. Therefore, the InsertTimestamp column is a binary(8) column that has a default of @@DBTS + 1. The example adds to the value that is returned by @@DBTS so that the UpdateTimestamp and InsertTimestamp columns have the same value after an insert is performed. If this is not done, it will look as if each row was updated after it was inserted.
The ID that Sync Framework creates for each client is a GUID; therefore, the two ID columns are uniqueidentifier columns. The columns have a default of 00000000-0000-0000-0000-000000000000. This value indicates that the server performed the update or insert. A later example includes a DeleteId column in the tombstone table.
Now that the columns have been added, the following example code adds indexes. These and other indexes in the example code are created on columns that are queried during synchronization. The indexes are added to emphasize that you should consider indexes when you implement change-tracking in the server database. Make sure that you balance server performance against synchronization performance.
Adding a Tombstone Table to Track Delete Operations
The following code example creates a tombstone table that has a clustered index and a trigger to populate the table. When a delete operation occurs in the Sales.Customer table, the trigger inserts a row in the Sales.Customer_Tombstone table. Before the trigger performs an insert operation, the trigger checks whether the Sales.Customer_Tombstone table already contains a row that has the primary key of a deleted row. This occurs when a row has been deleted from Sales.Customer, reinserted, and deleted again. If such a row is detected in Sales.Customer_Tombstone, the trigger deletes the row and reinserts it. The DeleteTimestamp column in Sales.Customer_Tombstone could also be updated.