What are key factors when considering a data replication solution?
Is your car faster than mine?
Maybe it is, but if the speed limit is 65 or 70 miles per hour, does it really matter? Also note that there are dependencies. For example, under what conditions do you measure? On a highway, or off-road? In dry or wet weather? On an empty road, or weaving between lots of other cars? And how expensive was your car relative to mine?
Assuming both cars can easily meet the speed limit, isn’t it more relevant to talk about other aspects of driving, such as fuel consumption at the speed limit, maintenance costs, or how much fun the vehicle is to drive? One aspect of performance that may still be relevant is acceleration: how long does it take to get to the speed limit?
Performance in Data Replication Solutions
- PERFORMANCE: The performance analogy in real-time data replication is whether one data replication product is faster than the other. The speed limit in this analogy is the transaction log generation rate. Can the data replication tool keep current with transaction log generation, or not? Dependencies include the transaction rate (lots of small transactions or fewer larger ones), the percentage of the transaction log that is relevant for replication, how many database objects must be tracked, what data types are involved, etc. If two replication products can easily keep up with transaction log generation volume, isn’t it more relevant to talk about other aspects, such as how many resources (CPU, memory, IO) each product uses to perform at the “speed limit”, or how much effort it takes an administrator to implement the technology? The one aspect of performance that may still be relevant is how long it takes to become current again when replication is running behind.
- SPEED: HVR is a replication technology that can capture and replicate transactions at high speed. The other day I ran a TPC-C workload on my local Oracle database at about 800 transactions per second. These transactions were captured and replicated into a SQL Server database with at most a couple of seconds of latency, all running on a two-year-old laptop. Is that impressive? Well… TPC-C transactions are tiny, so they don’t generate a lot of redo. Nothing else was running on my database at the time. Data types in TPC-C tables are simple. And HVR kept up with the transaction generation volume, so it wasn’t the bottleneck. So should performance be the ultimate measure when you have to decide which replication technology is good for your environment?
- RESOURCE CONSUMPTION: Think about resource consumption for a moment. Resource consumption includes CPU, memory, storage and network resources. The HVR engineers decided that it was best by default to trade CPU resources for network resources and always compress transactions across the wire. This is probably the right default for hybrid cloud environments or for replication across a wide-area network, where network bandwidth is typically severely limited. On the other hand, it may not be the best default for replication within a data center on an extremely busy database system that already averages 90% CPU utilization. Likewise, HVR does not store transaction files on the source system, which saves storage and IO resources but potentially increases latency.
- MAINTENANCE: Then think about the effort it takes to set up and maintain the environment. Of course there is the setup of real-time replication, but that is certainly not all. In my Oracle to SQL Server environment, aspects like DDL generation and initial load are extremely important to get up and running quickly. It is also comforting to be able to compare these heterogeneous environments to know whether the databases are in sync. HVR provides capabilities like these out of the box within a single offering, but not every replication tool does. On top of that, HVR provides a GUI to set up and maintain real-time replication. Many of these capabilities directly affect the cost of implementation and ongoing maintenance.
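The CPU-for-network trade-off described under RESOURCE CONSUMPTION can be measured on your own data. The sketch below is only an illustration: it uses Python's standard `zlib` module as a stand-in compressor (it is not HVR's compression, and the function name `compression_tradeoff` and the sample payload are made up for this example). It reports how many bytes compression saves and how much CPU time it costs.

```python
import time
import zlib


def compression_tradeoff(payload: bytes, level: int = 6) -> dict:
    """Compress a payload and report bytes saved versus CPU time spent.

    zlib is used purely as a stand-in; a real replication tool may use a
    different algorithm with a different cost profile.
    """
    start = time.process_time()              # CPU time, not wall-clock time
    compressed = zlib.compress(payload, level)
    cpu_seconds = time.process_time() - start

    return {
        "raw_bytes": len(payload),
        "compressed_bytes": len(compressed),
        "bytes_saved": len(payload) - len(compressed),
        "cpu_seconds": cpu_seconds,
    }


# Hypothetical payload: a batch of repetitive change records.
sample = b"INSERT INTO orders VALUES (42, 'widget', 9.99);\n" * 10_000
result = compression_tradeoff(sample)
print(result)
```

On a bandwidth-constrained link, a large `bytes_saved` usually justifies the CPU cost. On a host that is already CPU-bound, the same numbers may argue for the opposite setting.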
The moral of this post: test your replication technology to ensure it can keep up with transaction log generation. While you run that test, think about other aspects as well, such as the amount of effort it takes to set up and maintain the environment, and how easy the technology is to work with.
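When you run such a test, the "acceleration" question from earlier comes down to simple arithmetic: a backlog only shrinks by the difference between how fast the tool processes log and how fast the database generates it. Here is a minimal sketch of that calculation. The function name and the numbers are hypothetical, and it assumes both rates stay constant, which real workloads rarely do.

```python
from typing import Optional


def catch_up_seconds(backlog_mb: float,
                     log_rate_mb_s: float,
                     apply_rate_mb_s: float) -> Optional[float]:
    """Estimate how long replication needs to become current again.

    Returns None when the tool is not faster than log generation,
    because the backlog would then never shrink.
    """
    if backlog_mb <= 0:
        return 0.0                            # already current
    headroom = apply_rate_mb_s - log_rate_mb_s
    if headroom <= 0:
        return None                           # cannot keep up at all
    return backlog_mb / headroom


# Hypothetical figures: 3600 MB behind, the database generates 10 MB/s of
# log, and the replication tool processes 14 MB/s.
print(catch_up_seconds(3600, 10, 14))  # 900.0 seconds, i.e. 15 minutes
```

A tool that processes log only slightly faster than it is generated technically keeps up, but it recovers slowly from an outage. That is why headroom is worth measuring alongside steady-state latency.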
Want to take our data replication solution for a test drive? Request a trial now!