Mark Van de Wiel

Log-Based Replication for SQL Server Always On Systems

microsoft-sql-server-logoAccording to DE-Engine’s list of the most popular database technologies, Microsoft SQL Server remains in the top three for worldwide deployment year after year, only behind Oracle and MySQL. With that, it’s no surprise that a significant percentage of business-critical applications are built on SQL Server database technologies. By definition, all business-critical applications require a strategy for business continuity and high availability.

One of Microsoft’s solutions to this business continuity requirement for databases is the SQL Server Always On availability groups (SQL Server AG). An availability group supports a failover environment for a discrete set of user databases, known as availability databases, that failover together. An availability group supports a set of read-write primary databases and one to eight sets of corresponding secondary databases.

Icon_databaseOptionally, secondary databases can be made available for read-only access and/or some backup operations. Many database administrators (DBA) open the secondary databases on stand by nodes for non-critical BI queries or data warehousing operations. Enterprises make such serious investment to SQL Server solutions because of the criticality of the data and the importance of the application to their core business.

With mainstream cloud technologies, data beholden to this locally deployed, on-premise Always On availability group cluster needs to be replicated into cloud-based data warehouses or cloud-based data lakes. In this blog post, I explore a few dimensions of database replication from SQL Server availability groups to non-SQL Server databases into the cloud.

Preparing your SQL Server AG for replication:

As a high availability solution, applications read/write to a primary database on an active node while such changes are replicated synchronously or asynchronously to secondary databases on the same or a separate standby node through windows clustering technologies. Sometimes your DBA will require you to source only from standby node.

cyber-security-pictureThis is because DBAs believe in granting minimum privileges only a few users for fears of unintended actions on the users’ part that could negatively impact their SLAs on database performance or uptime.

This is also true when replication users require elevated privileges like sysadmin on the capturing database. When your DBA does not grant access to the primary node and your replication has to be sourced from the stand by node, some replication solutions utilizing SQL server articles (publisher-distributor transactional replication model with objects under replication called as articles) to enable replication will not be able to support it, because, in an Always On SQL server cluster, secondary databases cannot act as a replication publisher.

Solutions like HVR offer a number of options when replicating from SQL Server AG. HVR allows you to set up articles or CDC tables on the primary database and allow logs to be read from the secondary node to alleviate any security concerns. In fact, HVR uses CDC tables—not to populate change data or to publish to distribution database—but to ensure the SQL Server logs enough supplemental information in the transaction logs.

To minimize the overhead of replication, HVR can disable/delete the agent job that would populate the CDC tables or the distribution database. When using log-based replication, one key consideration is the management of the log truncation point on the primary node. It can be managed by HVR or by a separate log back up process.

When HVR is configured using an Archive Log Only mode, SQL Server instances that use an Always On standby instance as the source allows you to control the purge frequency of those backup transactions logs independent of retention from the primary.

Whether it’s on-premises or in the cloud, HVR has no limitations on the possible combinations of destination databases. In addition, HVR has the ability to integrate a continuous stream or micro-batches depending on your integration requirements.

With built-in initial load capabilities, compression, encryption, compare and repair, HVR is built for the cloud.

Interested in learning more? Check out these SQL Server-related knowledge-base articles and blogs:

In addition, here’s a step-by-step guide on how to set up HVR to capture from an always-on availability group.

 

About Mark

Mark Van de Wiel is the CTO for HVR. He has a strong background in data replication as well as real-time Business Intelligence and analytics.

Test drive
Contact us