SQL Azure Database: Sync Framework

Credits

Author

Dinakar Nethi

Contributors:

Sean Kelley

Technical Reviewers:

Michael Thomassy, Liam Cavanaugh

Summary

SQL Azure Database is a cloud database service from Microsoft. SQL Azure provides web-facing database functionality as a utility service. Cloud-based database solutions such as SQL Azure can provide many benefits, including rapid provisioning, cost-effective scalability, high availability, and reduced management overhead. This document is not intended to provide comprehensive information on SQL Azure Data Sync, rather the intent is to provide best practices & guidelines on synchronizing SQL Azure with SQL Server and to supplement the information available at the links in the References section.

Sync Framework

Microsoft Sync Framework is a data synchronization platform that can be used to synchronize data across multiple data platforms.  There is a provider for each different type of database. A key aspect of the Sync Framework is the ability to create custom providers. Providers enable any data sources to participate in the Sync Framework synchronization process, allowing peer-to-peer synchronization to occur. To include support for SQL Azure, the Microsoft Sync team released the Microsoft Sync Framework Power Pack for SQL Azure which includes a provider for SQL Azure and a Data Sync tool for SQL Azure.

The SqlAzureSyncProvider is an optimized SQL Azure provider that handles some of the complexities involved in handling transactions with SQL Azure by using Table-valued parameters to reduce round trips to the server and also by using an intelligent “back-off-algorithm” to scale down the batch size of BCP during synchronization when it hits the throttling levels.

The Data Sync tool is a wizard that helps connect between SQL Server and SQL Azure and automate the synchronization process.

Steps in setting up Synchronization

Following are the steps in setting up a Synchronization between an on-premise SQL Server and SQL Azure.

  • Download and install the Microsoft Sync Framework Power Pack for SQL Azure November CTP
  • Also, download the Walkthrough Microsoft Sync Framework Power Pack for SQL Azure November CTP.doc from the above location. The document has step-by-step instructions in setting up Synchronization between an on-premises SQL Server and SQL Azure.

For larger databases, you may see faster synchronization times if you create multiple scopes. Since each scope has one thread allocated to it, creating multiple scopes allows for parallel data migrations.

Scopes in Sync Framework

A scope is the combination of tables and filters. For example, you could define a filtered scope named Sales-WA that contains only the sales data for the state of Washington from the tablecustomer sales. If you define another filter on the same table, such as Sales-OR, this is a different scope. If you define filters, Sync Framework does not automatically handle the deletion of rows that no longer satisfy a filter condition. For example, if a user or application updates a value in a column that is used for filtering, a row moves from one scope to another. The row is sent to the new scope that the row now belongs to, but the row is not deleted from the old scope. Your application must handle this situation.

Guidelines for efficient Scoping

  • Each scope has one thread allocated to it from the OS. So distributing the tables across multiple scopes will help parallelize the data migrations
  • Put static/changing at a very low rate tables in one scope and reduce their sync frequency
  • Group frequently changing tables in different scopes
  • Put logically related tables (Primary Key-Foreign key dependency or logical dependency) in one scope
  • Scopes that are only read on the client should be marked as a download only as this streamlines the sync workflow and decreases sync times
  • It is better to minimize the number of clients that are in each scope with the best case being a different scope for each client.  This minimizes contention on the server and is ideal for the hub-spoke case where all changes flow through a single server vs. being synced between clients

Initialize via snapshots vs. full initialization wherever possible to improve initial sync time by an order of magnitude

Architecture

SQL Azure Data Sync uses a custom data provider for the Microsoft Sync Frame synchronization engine written especially for SQL Azure, named "SqlAzureSync Provider." This new provider performs efficiently, lowers the barrier to entry, and ensures reliability when synchronizing with SQL Azure by intelligently handling some SQL Azure-specific complexities that occur in multi-tenant systems. Specifically, the provider decreases the number of round-trips to the server by using table-valued parameters (TVPs) to apply changes. In addition, When SQL Azure uses its throttling mechanism to minimize the impact of run-away operations, SqlAzureSyncProvider responds by using a “back-off algorithm” which automatically reduces batch sizes from the default of 5,000 rows during synchronization. A helpful side-effect of the use of this algorithm is that changes are viewable before synchronization is complete, and synchronization progress can be displayed in real time.  

See Also

  • [[SQL Azure Overview]]
  • [[SQL Azure Data Sync Overview]] 

Other Languages

This article is also available in other languages, including Russian.

Back to Top