Real-Time Data Replication Questions
The 10 Most Common Customer Questions About Real-Time Data Replication
Business users are demanding access to real-time data to improve decision making across the organization. Whether they’re looking to react quickly to customer needs, differentiate from the competition, or respond instantly to unplanned outages, end users want to base their choices on up-to-the-minute information.
As a result, organizations are undergoing a massive paradigm shift—switching from batch database operations to real-time data replication. In my role as Director of Enablement for HVR Software, I have a front-row view of the issues and questions DBAs face as they begin to implement real-time replication solutions for the first time.
10 Common Data Replication Questions
The following are the 10 most common questions I encounter. By perusing these questions, you’ll have a better idea of what to expect as you begin your real-time data replication journey.
- Why must I start with the fundamentals?
When customers start a training program, the first thing they generally ask my team to do is implement the data replication software for them in their production environment and then document what we did. That’s the opposite of what they need to do to succeed.
Best practice is to start with the fundamentals. Customers should gain an understanding of the solution’s capabilities, common terminology, best practices for setup and configuration, and so on. Implementing the customer’s use case will be much easier and more successful if they know the basics and build on that knowledge to learn the more advanced capabilities their specific application requires.
- Why do I need to start with my initial use case?
Often, when customers are just starting out with their HVR implementation, they see all the databases we connect to, and all the topologies we support, and immediately start thinking about new ways they can use the solution.
For example, suppose they originally purchased HVR to replicate data from an on-premises Oracle database to an on-premises Teradata system for offload reporting. Once they see what the product can do, they want to replicate their data to the cloud for high availability or use streaming technologies, such as Apache Kafka.
But it’s important for customers to stay focused on why they purchased the product and what platforms they purchased it for. Once they learn how to use the product for the initial use case, they can start thinking outside the box and testing it on additional use cases.
- What do you mean by “real-time”?
Most customers currently process their data in batches. They take production data and perform extract, transform, and load (ETL) operations to bring the data into a data warehouse on a periodic basis to make that data actionable and add value.
HVR uses change data capture (CDC) to move data more quickly with less pressure on resources. Because CDC moves only the incremental changes rather than entire tables, it transforms operations from batch-oriented to real time. With HVR’s near real-time data replication, it takes one to two seconds to route the data and apply it to the destination of the customer’s choice.
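The idea of moving only incremental changes can be illustrated with a minimal Python sketch. It is not HVR’s API: the `ChangeEvent` type, the `apply_changes` function, and the in-memory dictionary standing in for a target table are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ChangeEvent:
    """One captured change (hypothetical structure, not an HVR type)."""
    op: str                                # "insert", "update", or "delete"
    key: int                               # primary key of the affected row
    row: Optional[Dict[str, Any]] = None   # changed column values; None for deletes


def apply_changes(target: Dict[int, Dict[str, Any]], changes: List[ChangeEvent]) -> None:
    """Apply captured changes, in commit order, to an in-memory target table."""
    for change in changes:
        if change.op == "delete":
            target.pop(change.key, None)
        elif change.op in ("insert", "update"):
            # Merge the changed columns into the existing row, if there is one.
            existing = target.get(change.key, {})
            target[change.key] = {**existing, **(change.row or {})}
        else:
            raise ValueError(f"unknown operation: {change.op}")


# The target starts as a copy of the source; only three changes are shipped afterwards.
target = {
    1: {"name": "Ada", "city": "London"},
    2: {"name": "Bob", "city": "Paris"},
}
changes = [
    ChangeEvent("update", 1, {"city": "Berlin"}),
    ChangeEvent("delete", 2),
    ChangeEvent("insert", 3, {"name": "Cy", "city": "Oslo"}),
]
apply_changes(target, changes)
print(target)
```

Only the three change events cross the wire here, regardless of how large the table is. A batch ETL job would reload every row to reach the same end state.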
- How do I install the product and add new data sources and targets?
Everyone wants an easy installation process, and that’s what we deliver. Regardless of the database or types of files they’ll replicate, installation takes less than 10 minutes. Users can also add new data sources and targets to build out a data warehouse quickly, without having to install additional components.
- How can I configure my distributed architecture?
Many organizations have large, complex data sets with different types of data and systems—including structured, unstructured or streaming data, on-premises or cloud solutions, and many other options. They’re interested in exploring various ways to set up HVR’s distributed architecture.
With HVR, rather than having many peer-to-peer installations with configurations managed on different boxes, we have a single unified console that lets you centrally view and manage all replication channels.
Customers can run HVR agents on the source, on the target, or on the hub only, implementing the best solution for their application. For example, running an agent on the source increases performance, while running the agent on the hub allows customers to accommodate cloud solutions that do not permit third parties to install software in their environment.
- How does the HVR environment scale with high performance?
HVR employs technologies that allow it to scale while providing predictably high performance. CDC provides the most efficient way to access data from the source. Rather than storing data locally on the source and then moving it to the destination, as competing solutions do, HVR stores data on the hub and then integrates the data where it’s needed. This reduces storage requirements on the production server and eliminates the disk I/O there that would otherwise add latency. Additionally, because HVR stores only incremental changes on the hub and compresses the data, it has a minimal impact on the network, boosting throughput.
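To make the network argument concrete, the following Python sketch compares the size of a full table snapshot with the size of a compressed batch of changed rows. It makes several assumptions: the data is synthetic, JSON and zlib stand in for whatever serialization and compression HVR actually uses (the article does not say), and `payload_size` is a hypothetical helper.

```python
import json
import zlib

# Synthetic data: a 10,000-row table in which 1% of the rows changed.
full_table = [{"id": i, "status": "open", "amount": i * 10} for i in range(10_000)]
changed_rows = [{"id": i, "status": "closed", "amount": i * 10} for i in range(0, 10_000, 100)]


def payload_size(rows, compress):
    """Return the number of bytes needed to ship the rows as JSON."""
    raw = json.dumps(rows).encode("utf-8")
    return len(zlib.compress(raw)) if compress else len(raw)


full_bytes = payload_size(full_table, compress=False)
delta_bytes = payload_size(changed_rows, compress=False)
delta_compressed_bytes = payload_size(changed_rows, compress=True)

print("full snapshot:", full_bytes, "bytes")
print("changed rows only:", delta_bytes, "bytes")
print("changed rows, compressed:", delta_compressed_bytes, "bytes")
```

The exact numbers depend on the data, but the ordering holds for this example: the compressed change batch is smaller than the uncompressed batch, which is in turn far smaller than the full snapshot.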
- Can I replicate in both real time and right time?
By default, HVR continuously streams data as it’s committed on the source, but some DBAs still need batch-style processing. For example, they might need a window without replication in which to perform nightly backups. HVR accommodates this requirement by allowing DBAs to schedule jobs. HVR’s unified management console has built-in scheduling, or you can use third-party scheduling tools as needed.
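For teams that drive scheduling from their own tooling, the backup-window check can be sketched in a few lines of Python. This is an illustration only, not HVR’s scheduler: `in_blackout` and the window times are hypothetical, and the one subtle point is handling a window that crosses midnight.

```python
from datetime import time


def in_blackout(now: time, start: time, end: time) -> bool:
    """Return True if `now` falls inside the no-replication window [start, end)."""
    if start <= end:
        # Window within a single day, e.g. 02:00 to 04:00.
        return start <= now < end
    # Window that wraps past midnight, e.g. 23:00 to 01:00.
    return now >= start or now < end


backup_start, backup_end = time(23, 0), time(1, 0)
for moment in (time(22, 59), time(23, 30), time(0, 30), time(1, 0)):
    state = "paused" if in_blackout(moment, backup_start, backup_end) else "running"
    print(moment.strftime("%H:%M"), state)
```

An external scheduler could call a check like this before starting or resuming a replication job.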
- How do you manage the HVR system?
Once your jobs are scheduled, you can let HVR monitor itself and notify the operator if any important event occurs. The HVR management console allows you to set up alerts that proactively notify you via email or Slack messages if anything falls behind or if replication discrepancies occur.
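The “falls behind” condition amounts to comparing replication lag against a threshold. The Python sketch below shows that logic in isolation. It assumes a hypothetical `latency_alert` function and a 10-second threshold, and it returns a message rather than sending email or Slack notifications, which in HVR are configured through the management console.

```python
from datetime import datetime, timedelta
from typing import Optional


def latency_alert(last_applied: datetime, now: datetime, threshold: timedelta) -> Optional[str]:
    """Return an alert message if the target lags the source by more than `threshold`."""
    lag = now - last_applied
    if lag > threshold:
        return (
            f"Replication is {int(lag.total_seconds())}s behind "
            f"(threshold {int(threshold.total_seconds())}s)"
        )
    return None


now = datetime(2024, 1, 1, 12, 0, 30)
threshold = timedelta(seconds=10)

# Last change applied 2 seconds ago: within the threshold, so no alert.
on_time = latency_alert(datetime(2024, 1, 1, 12, 0, 28), now, threshold)
# Last change applied 90 seconds ago: an alert message is produced.
behind = latency_alert(datetime(2024, 1, 1, 11, 59, 0), now, threshold)

print(on_time)
print(behind)
```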
- What security capabilities do you offer?
We understand that security becomes even more important when data is shared in the public cloud, and we offer capabilities to protect this data. We use SSL or TLS to secure data in transit. Transparent data encryption secures data at rest in most locations, with support for key management services including AWS Key Management Service (KMS) and Azure Key Vault. When using an HVR Agent or an HVR Proxy to integrate data between data centers or between cloud and on-premises, HVR provides additional security by requiring the firewall to be opened for only a single machine and port pair, in a single direction.
- How do I know if the data HVR sent really made it over?
We offer data validation services to give you peace of mind that data has made it successfully from the source to the target. Our data validation functionality can compare all, or some, of the tables and tell you if they’re in sync. If the tables are not in sync, the system shows you how to bring the data back into sync without having to reload it. You can use HVR to repair any data that’s out of sync with just the click of a button.
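Conceptually, compare-and-repair means finding the rows that differ and fixing only those rows. The Python sketch below illustrates the idea on two small in-memory tables keyed by primary key. The `compare_tables` and `repair` functions are hypothetical and say nothing about how HVR implements its comparison internally.

```python
from typing import Any, Dict, List, Tuple

Table = Dict[int, Dict[str, Any]]


def compare_tables(source: Table, target: Table) -> Tuple[Table, Table, List[int]]:
    """Return rows missing from the target, rows that differ, and keys of extra rows."""
    missing = {k: source[k] for k in source.keys() - target.keys()}
    differing = {
        k: source[k]
        for k in source.keys() & target.keys()
        if source[k] != target[k]
    }
    extra = sorted(target.keys() - source.keys())
    return missing, differing, extra


def repair(target: Table, missing: Table, differing: Table, extra: List[int]) -> None:
    """Bring the target back in sync by touching only the rows that are out of sync."""
    target.update(missing)
    target.update(differing)
    for key in extra:
        del target[key]


source = {1: {"qty": 5}, 2: {"qty": 7}, 3: {"qty": 9}}
target = {1: {"qty": 5}, 2: {"qty": 8}, 4: {"qty": 1}}

missing, differing, extra = compare_tables(source, target)
print("missing:", missing, "differing:", differing, "extra:", extra)

repair(target, missing, differing, extra)
print("in sync:", target == source)
```

In this example, row 1 already matches and is never rewritten. Only the three out-of-sync rows are touched, which is the point of repairing rather than reloading.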
HVR gives you the flexibility to address most data replication use cases, and the industry expertise to help you implement the most appropriate topology and solutions. For more information, we invite you to contact us.