HVR Runs in a Scalable Distributed Architecture
What is a distributed architecture?
A distributed architecture is the most efficient solution for moving data in complex environments. With HVR, there is a central point of control (the “hub”), and through the use of agents, work on changes takes place as close to the source as possible, which keeps the impact on the source low and performance high. Agents are optional with HVR; whether or not to deploy them depends on your goals.
To better understand the benefits of an architecture using agents, we invite you to watch this video.
Benefits of a distributed architecture include:
- Flexibility in how you design your environment
How it works
Log-based Change Data Capture (CDC) takes place on the source server or as close to it as possible. This is where relevant transaction data is extracted and compressed. The data is then sent across the wire to the central hub, which acts as the distributor. The hub guarantees recoverability and queues the compressed transactions as needed.
Separately from capture, an integrate process picks up the compressed changes for its destination and sends them to the target, where they are unpacked and applied using the most efficient method for that target.
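The flow described above (capture and compress near the source, queue on the hub, unpack and apply at the target) can be illustrated with a minimal Python sketch. It is not HVR code and does not use any HVR API: the names `capture`, `Hub`, and `integrate`, the JSON record layout, and the use of `zlib` for compression are all assumptions made purely for illustration.

```python
import json
import zlib
from collections import deque


def capture(log_records, replicated_tables):
    """Hypothetical capture step: keep relevant changes and compress them as one batch."""
    relevant = [r for r in log_records if r["table"] in replicated_tables]
    return zlib.compress(json.dumps(relevant).encode("utf-8"))


class Hub:
    """Hypothetical hub: queues compressed batches until integrate picks them up."""

    def __init__(self):
        self._queue = deque()

    def enqueue(self, batch):
        self._queue.append(batch)

    def dequeue(self):
        # Returns None when nothing is waiting.
        return self._queue.popleft() if self._queue else None


def integrate(batch, target):
    """Hypothetical integrate step: unpack a batch and apply each change in order."""
    for change in json.loads(zlib.decompress(batch).decode("utf-8")):
        key = (change["table"], change["id"])
        if change["op"] == "delete":
            target.pop(key, None)
        else:  # insert or update
            target[key] = change["row"]


# Made-up transaction log entries; "audit_tmp" is not replicated.
log = [
    {"table": "orders", "op": "insert", "id": 1, "row": {"total": 10}},
    {"table": "audit_tmp", "op": "insert", "id": 7, "row": {"note": "ignored"}},
    {"table": "orders", "op": "update", "id": 1, "row": {"total": 12}},
]

hub = Hub()
hub.enqueue(capture(log, {"orders"}))  # runs near the source

target = {}  # stand-in for the target database
batch = hub.dequeue()
while batch is not None:  # runs separately from capture
    integrate(batch, target)
    batch = hub.dequeue()
```

Because the hub only stores and forwards compressed batches, capture and integrate can run at different speeds; in this sketch the queue is an in-memory `deque`, whereas a real hub would need durable storage to guarantee recoverability.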
Distributed architecture for on-premise to cloud data integration scenario:
Example: Oracle, SQL Server and a data lake in Amazon Redshift:
An organization using on-premise Oracle and SQL Server databases as sources and a data lake in Amazon Redshift will be able to scale to many sources, with capture running on the individual database servers. These servers send compressed (and encrypted) changes into the AWS cloud to be applied to Redshift. The changes are staged in S3 and copied into Redshift tables, followed by set-based SQL statements on the target tables, so that in aggregate the analytical database can still keep up with the transaction load from multiple sources.
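The "stage, then apply with set-based SQL" pattern mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the statements HVR actually issues: SQLite stands in for Redshift, a bulk insert into a staging table stands in for the S3 load and copy, and the `orders` table, the `orders_stage` table, and the `apply_batch` function are hypothetical. The sketch also assumes each batch holds at most one net change per key.

```python
import sqlite3


def apply_batch(conn, changes):
    """Stage a batch of changes, then apply it with two set-based statements."""
    cur = conn.cursor()
    cur.execute(
        "CREATE TEMP TABLE IF NOT EXISTS orders_stage "
        "(op TEXT, id INTEGER, total INTEGER)"
    )
    cur.execute("DELETE FROM orders_stage")
    # Stand-in for the bulk load: the whole batch lands in the staging table.
    cur.executemany("INSERT INTO orders_stage VALUES (?, ?, ?)", changes)
    # Set-based step 1: remove every target row touched by the batch.
    cur.execute("DELETE FROM orders WHERE id IN (SELECT id FROM orders_stage)")
    # Set-based step 2: insert the new version of every row that was not deleted.
    cur.execute(
        "INSERT INTO orders (id, total) "
        "SELECT id, total FROM orders_stage WHERE op <> 'delete'"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10), (2, 20)])

# One net change per key: update row 1, delete row 2, insert row 3.
apply_batch(conn, [("update", 1, 12), ("delete", 2, None), ("insert", 3, 30)])
rows = conn.execute("SELECT id, total FROM orders ORDER BY id").fetchall()
```

The point of the pattern is that the target runs a fixed number of statements per batch regardless of how many rows the batch contains, which suits analytical databases far better than applying changes one row at a time.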
Data integration architecture: understanding agents
The question of whether or not to use an agent when performing data integration, especially around use cases with log-based Change Data Capture (CDC) and continuous, near real-time delivery, is common.
In this video, HVR’s CTO, Mark Van de Wiel, goes into detail about:
- The pros and cons of an agentless setup versus an agent setup
- When to consider one over the other
- Two common distributed architectures using an agent setup