Real-Time Data Replication Test Drive

Data Validation

With data replication in place, you expect systems to be in sync, but are they? How do you find out? And if they are not in sync, what operations were performed recently that may have caused the systems to go out of sync?

Compare

To identify whether your data is in sync, HVR provides a Compare capability. Like Refresh and ongoing data replication, Compare is based on the Channel Definition, i.e. any transformations that are part of the Channel Definition are taken into consideration when you compare the data.

Bulk Compare

For the purpose of the Test Drive, you will perform a Bulk Compare between your Oracle source and your MariaDB and PostgreSQL targets:

1. To compare the data for the orcl2mdbpg Channel, navigate to the Channel Definition orcl2mdbpg, and use the context menu anywhere within the Channel Definition to select HVR Compare.

2. At the top left, select Location: orcl.

3. At the top right, select both Locations, mdb and pg.

4. Keep the default with all tables selected.

5. Below the table selector, leave the default set to Bulk Granularity.

6. Set Parallelism for Locations to 3.

7. Set Parallelism for Sessions to 3.

8. Click Compare at the bottom of the dialog to start an interactive comparison.

More on this topic

There are two modes to compare the data:

  • Bulk Compare: computing a checksum (hash) for every row based on all column values, and a resulting checksum across all rows.
    Bulk Compare is fast because the checksum is computed by the HVR agents in a distributed setup, and only the value of the checksum is sent to the hub to evaluate whether source and target are in sync. Bulk Compare will report the row count as part of the comparison, but of course, an identical row count is by no means sufficient to know whether data is in sync.
  • Row-wise Compare: comparing the data literally row by row, column by column. Row-wise Compare can report detailed differences and report exactly the state of the comparison in terms of number of inserts, updates and deletes required to get the target back in sync with the source.
    Row-wise Compare does require data to be sorted and sent across the wire to the target system (taking advantage of HVR’s compression and encryption).
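
The Bulk Compare idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the guide does not say which hash function HVR uses or how it combines row checksums, so SHA-256 and the additive combination below are stand-ins, and the names row_checksum and table_checksum are hypothetical.

```python
import hashlib

def row_checksum(row):
    """Hash one row based on all of its column values (SHA-256 is an assumption)."""
    h = hashlib.sha256()
    for value in row:
        data = repr(value).encode("utf-8")
        # Length-prefix each value so ("ab", "c") and ("a", "bc") hash differently.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return int.from_bytes(h.digest(), "big")

def table_checksum(rows):
    """Return (row_count, checksum); the sum makes the result independent of row order."""
    count = 0
    total = 0
    for row in rows:
        count += 1
        total = (total + row_checksum(row)) % (1 << 256)
    return count, total

source = [(1, "alice", 100), (2, "bob", 250)]
target_same = [(2, "bob", 250), (1, "alice", 100)]  # same rows, different order
target_diff = [(1, "alice", 100), (2, "bob", 251)]  # same row count, one changed value

in_sync = table_checksum(source) == table_checksum(target_same)
same_count = table_checksum(source)[0] == table_checksum(target_diff)[0]
out_of_sync = table_checksum(source) != table_checksum(target_diff)
```

Only the small (row_count, checksum) pair would need to travel to the hub. The last two comparisons show why an identical row count alone does not prove that the data is in sync.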

HVR supports Compare between databases in a heterogeneous setup, as well as between a database and a file system like S3, HDFS, or one of the Azure data stores.
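
Row-wise Compare, as described above, can be pictured as a merge over two sorted row streams. The sketch below is an assumption-laden illustration, not HVR's implementation: it assumes every row has a unique key in its first column, and the function name row_wise_compare is hypothetical.

```python
def row_wise_compare(source_rows, target_rows, key=lambda row: row[0]):
    """Return the (inserts, updates, deletes) needed to bring the target in sync."""
    # Both sides are sorted on the key so they can be walked in a single pass.
    src = sorted(source_rows, key=key)
    tgt = sorted(target_rows, key=key)
    inserts, updates, deletes = [], [], []
    i = j = 0
    while i < len(src) and j < len(tgt):
        ks, kt = key(src[i]), key(tgt[j])
        if ks == kt:
            # Same key on both sides: compare column by column.
            if src[i] != tgt[j]:
                updates.append(src[i])
            i += 1
            j += 1
        elif ks < kt:
            # Row exists only on the source: the target needs an insert.
            inserts.append(src[i])
            i += 1
        else:
            # Row exists only on the target: the target needs a delete.
            deletes.append(tgt[j])
            j += 1
    inserts.extend(src[i:])   # leftover source rows are missing on the target
    deletes.extend(tgt[j:])   # leftover target rows do not exist on the source
    return inserts, updates, deletes

source = [(1, "alice", 100), (2, "bob", 250), (3, "carol", 75)]
target = [(2, "bob", 251), (3, "carol", 75), (4, "dave", 10)]
inserts, updates, deletes = row_wise_compare(source, target)
```

For this sample data the target needs one insert (key 1), one update (key 2), and one delete (key 4). Unlike a checksum, this approach needs the actual rows from both sides, which is why the data has to be sorted and sent across the wire.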

Next Steps

In all likelihood the result of this comparison, which takes a few seconds to compute, will show differences. Some tables will likely come out as identical, but others will be different. Notice that some tables are reported as different despite having identical row counts, while others have a different row count and are, of course, reported as different.

The reason for all of these differences is most likely in-flight changes that are not accounted for during the Bulk Compare. HVR can take in-flight changes into account during a Row-wise Compare. Close the Compare Results dialog but leave the Compare dialog open to make changes to the Compare run.

1. On the Options tab below the table selector, switch to Row by Row Granularity. Notice that as you switch, the option for Online Compare becomes enabled.

2. Change the option Parallelism for Locations back to None.

3. Select the option Online Compare and leave the default Compare tables once and combine differences with captured transactions selected.

This default method for Online Compare is the fastest and most accurate, but it is only available if HVR is used to replicate the changes. Note that you can also use HVR to compare data in environments that do not use HVR to replicate changes; this method is then not available.

4. Switch to the Scheduling tab and choose the option Generate Compare Event. Leave the default Start Immediately. (Live Compare is only available through events and not as an interactive option.)

5. Click Compare to start the run.

More on this topic

Notice that this Compare run opens the browser, where you can see the Compare results being populated in real time as the run progresses. This run will of course take a bit longer because (1) data must be sorted and transferred to the target, and (2) following an initial Compare, the identified differences have to be reconciled with transaction files during the Compare run to identify whether systems are exactly in sync.

6. If all is well, your systems should now be in sync. Navigate into the details of the jobs to verify this.

7. Close the browser window.
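
The reconciliation step of Online Compare, in which differences from the initial Compare are combined with captured transactions, can be illustrated conceptually as follows. The guide does not describe HVR's algorithm or its transaction file format, so everything here is a simplified assumption: the data shapes, the operation names, and the function reconcile are hypothetical.

```python
def reconcile(differences, pending_changes):
    """Split differing keys into in-flight (explained) and real differences.

    differences:     {key: (source_row_or_None, target_row_or_None)}
    pending_changes: [(op, key, row)] captured but not yet applied to the
                     target, in commit order; op is "insert", "update" or "delete".
    """
    # Replay pending changes to get the expected target state per touched key.
    expected = {}
    for op, key, row in pending_changes:
        expected[key] = None if op == "delete" else row

    explained, real = [], []
    for key, (source_row, _target_row) in sorted(differences.items()):
        # A difference is in-flight if applying the captured changes
        # would make the target row equal to the source row.
        if key in expected and expected[key] == source_row:
            explained.append(key)
        else:
            real.append(key)
    return explained, real

differences = {
    2: ((2, "bob", 250), (2, "bob", 200)),  # values differ
    5: (None, (5, "eve", 1)),               # extra row on the target
    7: ((7, "gus", 3), None),               # row missing on the target
}
pending = [
    ("update", 2, (2, "bob", 250)),
    ("insert", 7, (7, "gus", 2)),
]
explained, real = reconcile(differences, pending)
```

In this toy example key 2 is explained by a captured update that has not reached the target yet, while keys 5 and 7 remain real differences. This is why an Online Compare can report systems as in sync even while changes are still flowing.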

More on this topic

HVR Compare has lots of capabilities that are identical to HVR Refresh capabilities. The Scheduling tab provides options like scheduling and slicing, and the Contexts tab provides options identical to the options available on the Contexts tab for HVR Refresh.

Close the HVR Compare dialog.

Audit

As you interact with HVR, audit records are stored in a database table to enable auditing of HVR operations. The audit is useful to review recent changes, with some organizations requiring these kinds of audits for regulatory compliance, such as SOX.

Since by now you have already performed a lot of operations with HVR, you can go straight to the Audit function and review the audit records.

1. In the tree, navigate to your hub hvrhub and use the context menu to select Event Audit Trail.

As with Topology, Statistics, and Event-driven Compare, the Event Audit Trail is viewed in a browser. You see recent Compare runs as well as interactions with the hub to start jobs and run the Refresh. You also see the definition changes you made to the repository to define Actions, add Tables, create Location Groups, create the Channel, and define Locations.

Drill into some of the audit records to review the details. You will notice all details for every interaction have been logged.

Conclusion

Congratulations!

Reading this implies that you walked through the steps to set up data replication between three systems, running different database technologies, from scratch. You learned how to configure data replication using HVR, inspected the topology chart that resulted from your setup, and looked at the statistics for the ongoing replication. If there are issues with the replication then you know HVR can proactively send out notifications.

You also validated whether the databases are truly in sync and saw that along the way, HVR diligently kept an audit log of the operations you performed as you were making changes to the setup and implemented data replication.

You may now show what you achieved in this trial to your co-workers, or play around with this setup for the remaining time that this trial is available to you.

Contact us to see how we can help you in your environment.
