Questions about Always On’s data synchronization method

Question

Hello, I am curious about how data is synchronized on the secondary in Always On.

I have confirmed that on the secondary, the redo thread reads the log cache or the redo log file and applies the changes to the data file and index file. I have doubts about how this synchronization works.

Is replaying a redo log different from a checkpoint?

I have seen information that checkpoints do not occur on the secondary. If so, I assume that to synchronize the data, the log is read each time and the data is written immediately, but I am curious about the details of how redo log replay is performed. (How is the data in the redo log structured? Is the log kept on a per-transaction basis and simply re-executed? I am curious about this internal aspect.)

I am also curious whether the buffer cache is bypassed when synchronizing data. As far as I know, SQL Server writes ordinary data modifications to the buffer cache and flushes them to the data file when a checkpoint occurs. If checkpoints do not occur on the secondary, then data is not loaded into the buffer cache, and I wonder whether dirty pages never occur as a result.
I am curious about what advantages there are to synchronizing data in a different way than on the primary.

My guess is that data not yet flushed by a checkpoint could be lost when a failover occurs, so the redo log is replayed(?) to enable fast, real-time synchronization.

I have a lot of questions about the internal workings of SQL Server, because I believe that successful service operation is possible only by understanding these details. I know this is a lot of questions, but I look forward to your answers. Thank you.

Answer

Yes, replaying a log is very different from a checkpoint. A checkpoint flushes modified pages to the data file on disk, regardless of whether the transaction is committed. Replaying the log, on the other hand, updates the pages according to the information in the log file. And this is only committed transactions.

No, replaying the log does not use the buffer cache, but writes to the pages directly.

And there is no checkpoint on the secondary, because there are never any dirty pages. And there are never any dirty pages, because either the secondary is entirely inaccessible, or it is a read-only replica. In neither case can there be any updates.

I'm not sure what you mean by your last question. There is no synchronisation on the primary per se; the synchronisation process exists precisely to keep the secondary in sync with the primary, which is the master for the operation.

Answer

Hi, MINJAE KO

How is the data in the redo log structured? Is the log left on a transaction basis and simply re-executed?

For a database configured as the AlwaysOn primary replica, SQL Server creates a worker thread called the Log Scanner. This thread is specifically responsible for reading log records from the log buffer or log file, packaging them into log blocks, and sending them to each secondary replica. Its continuous operation ensures that data changes on the primary replica can be propagated to the secondary replicas persistently.

On the secondary replica, there are also two threads that complete the corresponding data update actions, namely, Harden and Redo.

The Harden thread writes the log blocks sent by the primary replica's Log Scanner into the log file on the disk of the secondary replica (this process is called "hardening").

The Redo thread, on the other hand, reads the log blocks from the disk and translates the log records into data modification operations, completing them on the database of the secondary replica.

Once the Redo thread completes its work, the database on the secondary replica will be consistent with the primary replica. AlwaysOn uses this mechanism to maintain synchronization between the replicas. The Redo thread periodically communicates with the primary replica to inform it of its progress. This allows the primary replica to know how far the data divergence is between the two sides. These threads work independently to achieve higher efficiency.

The Log Scanner is responsible for transmitting log blocks without waiting for the Log Writer to complete log hardening. Once the log blocks are hardened on the secondary replica, it sends a message to the primary replica to notify that the data has been successfully transmitted, without waiting for the redo operation to complete. The design goal is to minimize the performance impact of the additional operations brought by AlwaysOn on normal database operations.

AlwaysOn has mechanisms in place for these threads to collaborate with each other. Whether data divergence is allowed between replicas is determined by the availability mode of AlwaysOn.
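
The separation between hardening and redo described above can be illustrated with a small toy model. This is only a sketch, not SQL Server's implementation: the class name `SecondaryReplica`, the block layout (`lsn`, `changes`), and the `pages` dictionary are all hypothetical names introduced for illustration. It shows that the hardened LSN can run ahead of the redone LSN because the two steps advance independently.

```python
from dataclasses import dataclass, field


@dataclass
class SecondaryReplica:
    """Toy model of a secondary: a hardened log plus redo progress (hypothetical)."""
    hardened_log: list = field(default_factory=list)  # blocks written to the local log
    pages: dict = field(default_factory=dict)         # page_id -> value, stands in for data files
    redo_index: int = 0                               # next hardened block redo will apply

    def harden(self, block):
        """Harden step: persist the block and return its LSN as the acknowledgment."""
        self.hardened_log.append(block)
        return block["lsn"]

    def redo(self, max_blocks=1):
        """Redo step: apply up to max_blocks already-hardened blocks to the pages."""
        end = min(self.redo_index + max_blocks, len(self.hardened_log))
        for block in self.hardened_log[self.redo_index:end]:
            for page_id, value in block["changes"]:
                self.pages[page_id] = value
        self.redo_index = end

    @property
    def hardened_lsn(self):
        return self.hardened_log[-1]["lsn"] if self.hardened_log else 0

    @property
    def redone_lsn(self):
        return self.hardened_log[self.redo_index - 1]["lsn"] if self.redo_index else 0


secondary = SecondaryReplica()
blocks = [
    {"lsn": 100, "changes": [("page1", "a")]},
    {"lsn": 200, "changes": [("page1", "b"), ("page2", "x")]},
    {"lsn": 300, "changes": [("page2", "y")]},
]
acks = [secondary.harden(b) for b in blocks]  # all three blocks are hardened
secondary.redo(max_blocks=2)                  # redo has applied only two of them
print(secondary.hardened_lsn, secondary.redone_lsn, secondary.pages)
# prints: 300 200 {'page1': 'b', 'page2': 'x'}
```

In this model the acknowledgment is returned by `harden`, not by `redo`, which mirrors the point that the secondary notifies the primary once the block is hardened, without waiting for redo to finish.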

I am curious about what advantages there are to synchronizing data in a different way than primary.

The availability mode determines whether the primary replica needs to wait for a secondary replica to harden the transaction log records to disk before committing a transaction.

Asynchronous Commit Mode: Availability replicas using this availability mode are known as "asynchronous commit replicas." When a secondary replica is in asynchronous commit mode, the primary replica can commit transactions without waiting for confirmation from the secondary replica that the log has been hardened. This means that transaction commits on the primary database are not delayed by the secondary database. However, updates on the secondary database might lag behind the primary, and in the event of a failover, some data may be lost. Additionally, because the primary replica does not wait for acknowledgments from the secondary replicas, issues on the secondary replicas never affect the primary replica.

In asynchronous commit mode, the secondary replica will attempt to keep up with the log records of the primary replica. However, even if the data on the secondary and primary databases is effectively synchronized, the availability group will always consider the secondary database to be in a "SYNCHRONIZING" state (i.e., "not synchronized"), because theoretically, in asynchronous mode, the secondary database could fall behind at any point.

Synchronous Commit Mode: In synchronous commit mode, the primary replica waits for a confirmation from the synchronous commit secondary replica that it has hardened the log to disk before committing a transaction.

As long as the secondary replica has not confirmed the log hardening, transactions on the primary replica cannot be committed. This ensures that the data on both sides is always synchronized. As long as data synchronization is ongoing, the secondary database will remain in a "SYNCHRONIZED" state. Synchronous commit mode ensures that the data on a given secondary database is fully synchronized with the primary database.

However, this assurance comes at the cost of increased transaction commit latency on the primary database. In other words, synchronous commit mode prioritizes high availability over performance. To optimize the performance of Always On availability groups in synchronous commit mode,

Log Synchronization Steps

Connection: The secondary replica establishes a connection to the primary replica through the primary's mirroring endpoint.
Request Data: The secondary replica sends a request to the primary replica asking for log blocks. The secondary and primary replicas will negotiate an initial log block LSN position along with some other information.
Run Log Scanner: On the primary replica, the Log Scanner worker thread starts operating. The Log Scanner is responsible for transporting log blocks to the secondary replica.
Harden and Redo Log: The secondary replica uses the Harden and Redo worker threads to process the log blocks sent by the Log Scanner. The Harden thread hardens the log blocks to the secondary replica's disk, while the Redo thread replays the transactions recorded in the log on the secondary replica.
Feedback Progress: Each time the secondary replica receives three messages from the primary replica, it sends a progress message back to the primary replica detailing the hardening and redo progress. If the secondary replica does not receive three messages within one second, a progress message is still sent back. The progress message sent to the primary replica includes information about which LSNs have been redone and hardened on the secondary replica.

Through the above process, the log that has been hardened on the secondary database will eventually catch up to the end of the log on the primary database. At this point, the state of the secondary database will be set to SYNCHRONIZED.

The time required for this synchronization phase mainly depends on how far behind the secondary database is from the primary database at the start of the session. Additionally, it's crucial to understand the difference between the redo thread and the harden thread, as this is important for understanding the transaction commit mechanism. The harden thread writes the log to the log file, whereas the redo thread is responsible for updating the actual data files with the data contained in the log. These two threads work independently in Always On.
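
The Feedback Progress rule from the steps above (send after three messages, or after one second regardless) can be sketched as a small decision helper. This is an illustration under assumptions: the class `ProgressReporter`, its method names, and the explicit `now` timestamps are hypothetical and are not part of any SQL Server API. Time is passed in explicitly so the behavior is easy to follow.

```python
class ProgressReporter:
    """Decides when a secondary would send a progress message (hypothetical sketch)."""

    def __init__(self, message_threshold=3, max_interval_s=1.0):
        self.message_threshold = message_threshold  # "three messages" from the text
        self.max_interval_s = max_interval_s        # "one second" from the text
        self.pending = 0                            # messages received since last report
        self.last_sent_at = 0.0                     # time of the last progress message

    def on_message(self, now):
        """Record one message from the primary; return True if progress should be sent."""
        self.pending += 1
        return self._maybe_send(now)

    def on_tick(self, now):
        """Periodic check that runs even when no message has arrived."""
        return self._maybe_send(now)

    def _maybe_send(self, now):
        due = (self.pending >= self.message_threshold
               or now - self.last_sent_at >= self.max_interval_s)
        if due:
            self.pending = 0
            self.last_sent_at = now
        return due


reporter = ProgressReporter()
results = [
    reporter.on_message(0.1),  # 1 pending
    reporter.on_message(0.2),  # 2 pending
    reporter.on_message(0.3),  # 3 pending, so a progress message is sent
    reporter.on_message(0.5),  # 1 pending, only 0.2 s since the last send
    reporter.on_tick(1.4),     # 1.1 s since the last send, so it is sent anyway
]
print(results)  # prints: [False, False, True, False, True]
```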

Transaction Commit Actions

Commit Transaction: Run the COMMIT TRAN command to commit the transaction on the primary replica.
Write to Local Log: On the primary replica, the COMMIT TRAN command is written as a log record (at this point, the record is still in the primary database's log cache). The log writer thread on the primary replica then composes all log records up to the COMMIT command into a log block and writes it from the cache to the LDF file on disk. After the log is written to disk, the primary database begins waiting for a message from the secondary replica to confirm that the log has been successfully written to disk on the secondary database. Until then, the transaction commit operation remains in a waiting state.
Scan Log: As the log block starts being written from cache to disk, it signals the Log Scanner worker thread that "the log is ready to be sent to the secondary replica." The Log Scanner retrieves the log block from the cache and sends it to the Always On log block decoder (Log Block Cracker). The decoder searches the log for operations that require special handling, such as file stream operations, file growth, etc. The decoder sends these operations as a message to the secondary replica. Once the log block is decoded, the entire log block is also sent to the secondary replica as a message.
Process Log Block Message: The log block message is processed on the secondary replica. The harden thread is responsible for hardening the log to disk. The log is then saved to the log cache on the secondary database, and the redo thread retrieves the log block from the cache and begins performing the redo operations.
Feedback Progress: Each time the secondary replica receives three messages from the primary replica, it sends a progress message back to the primary replica detailing the hardening and redo progress. If the secondary replica does not receive three messages within one second, a progress message is still sent back.

The progress message includes information about which LSNs have been hardened and which LSNs have been redone. Since the redo thread starts later than the harden operation, there might be more hardened LSNs than redone LSNs.

Complete Commit: The primary database receives the confirmation message from the secondary replica, completes the transaction commit process, and sends a confirmation message to the client.

In synchronous commit mode, log blocks must be hardened to disk on both the primary and secondary replicas before a transaction can actually be committed on the primary database. However, it does not require the log to be fully redone on the secondary replica, which helps mitigate the performance impact on the primary replica.
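
The commit condition described in these steps can be summarized in a short sketch. It is a simplified model under assumptions, not the engine's actual logic: the function `can_complete_commit` and the dictionary keys `mode`, `hardened_lsn`, and `redone_lsn` are hypothetical, and the sketch ignores disconnected replicas and timeouts.

```python
def can_complete_commit(commit_lsn, primary_hardened_lsn, secondaries):
    """Return True when the primary could report the commit to the client.

    secondaries is a list of dicts with hypothetical keys:
    'mode' ('sync' or 'async'), 'hardened_lsn', and 'redone_lsn'.
    """
    if primary_hardened_lsn < commit_lsn:
        return False  # the local log block is not on disk yet
    for replica in secondaries:
        # Only synchronous-commit secondaries hold up the commit, and only
        # their hardened LSN matters; redo progress is deliberately ignored.
        if replica["mode"] == "sync" and replica["hardened_lsn"] < commit_lsn:
            return False
    return True


replicas = [
    {"mode": "sync", "hardened_lsn": 500, "redone_lsn": 300},
    {"mode": "async", "hardened_lsn": 200, "redone_lsn": 200},
]
print(can_complete_commit(500, 500, replicas))  # prints: True
print(can_complete_commit(600, 600, replicas))  # prints: False
```

In the first call the synchronous secondary has hardened LSN 500 but only redone up to 300, and the asynchronous secondary is far behind, yet the commit can complete. In the second call the synchronous secondary has not yet hardened LSN 600, so the commit would still be waiting.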

Additionally, Always On periodically checks the health of each replica and the databases on each replica. If a replica cannot communicate normally, the connection between the primary and secondary replicas enters the DISCONNECTED state.

Best Regards,

Mikey Qiao

If you're satisfied with the answer, don't forget to "Accept it," as this will help others who have similar questions to yours.
