Using Change Data Capture to Augment your ELT/ETL Solutions

Reporting on near real-time data is a must in today’s landscape. To stay competitive, companies are adopting an operational BI strategy for day-to-day tactical decision making, both to increase profits and to react faster to changing attitudes. ETL tools cannot accomplish this on their own. Why?

ETL tools extract all or part of the data directly from the source tables, which causes massive overhead not only on the source systems but on the network infrastructure as well. Because large volumes of raw data must be extracted and moved, the work has to be done in ‘batch windows’, typically ‘off-peak’ or in a ‘nightly batch window’, so as not to impact the performance of other business operations. As a result, end users and analysts report on stale data that is at least 24 hours old. This does not work if, for example, a supermarket is monitoring a specific price promotion on one of its brands. If the promotion is not doing well, it is impossible to make changes during the day when the data is not there, resulting in lost revenue and high margin costs.

Enabling Change Data Capture for Real-time BI with your ETL Tools

One function of HVR is log-based Change Data Capture (CDC) on a wide variety of source RDBMS platforms, such as Oracle, Microsoft SQL Server, IBM DB2, PostgreSQL, Ingres and many more. HVR captures only committed transactions from the transaction logs on the source systems, then applies dense compression to shrink the data further before streaming it to one or more destinations in near real-time. This efficiency lets ETL tools such as Informatica PowerCenter, IBM InfoSphere DataStage and SnapLogic combine with HVR, which replaces the E and L in ELT/ETL, to provide timely access to the most up-to-date data for operational reporting.
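To make the idea of log-based capture concrete, here is a minimal Python sketch. It is not HVR's implementation: the record layout (`kind`, `txn`, `op`, `row`), the function names and the use of `zlib` are all assumptions chosen for illustration. It buffers changes per transaction, releases them only when a commit record appears, drops rolled-back work, and compresses the result before it would be sent over the network.

```python
import json
import zlib
from collections import defaultdict

def committed_changes(log_records):
    """Yield row changes only for transactions that reach a commit record."""
    pending = defaultdict(list)  # open transactions, keyed by transaction id
    for rec in log_records:
        if rec["kind"] == "change":
            pending[rec["txn"]].append(rec)
        elif rec["kind"] == "commit":
            # Release the buffered changes in their original log order.
            yield from pending.pop(rec["txn"], [])
        elif rec["kind"] == "rollback":
            # Rolled-back work never leaves the source system.
            pending.pop(rec["txn"], None)

def pack(changes):
    """Serialize and compress a list of changes before streaming it."""
    return zlib.compress(json.dumps(changes).encode("utf-8"))

def unpack(payload):
    """Reverse of pack(), as the receiving side would do."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))

# Hypothetical log: transaction 1 commits, transaction 2 is rolled back.
log = [
    {"kind": "change", "txn": 1, "op": "insert", "table": "prices",
     "row": {"sku": "A1", "price": 2.5}},
    {"kind": "change", "txn": 2, "op": "update", "table": "prices",
     "row": {"sku": "B7", "price": 1.99}},
    {"kind": "commit", "txn": 1},
    {"kind": "rollback", "txn": 2},
]

changes = list(committed_changes(log))
payload = pack(changes)
```

Only the change from transaction 1 ends up in `payload`. Because nothing is read from the source tables themselves, the source database does no extra query work for the capture.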

HVR with ETL for End to End Data Integration

HVR can augment ELT/ETL technologies in several different ways:

  1. Using HVR to load directly into a database staging area
  2. Using the Agent Plugin API to create user logic that streams data into the ELT/ETL tool of choice. This method allows HVR to orchestrate the ELT/ETL at a transactionally consistent point in time.
  3. Using HVR to output to a CSV/XML file

The Power of ETL + Change Data Capture for Real-Time BI

This end-to-end data integration solution allows companies to trickle-feed insert, update and delete operations continuously from multiple sources into a consolidated data warehouse or enterprise data lake. The ELT/ETL tool of choice can then consume and transform the data in ‘micro-batches’ periodically throughout the day, removing the need for ‘nightly batch window’ operations.
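As a rough illustration of micro-batch consumption, the Python sketch below polls a staging table for changes that arrived since the last run. The table name `staging_prices`, its columns and the `seq` watermark column are hypothetical, and SQLite stands in for the real staging database. An actual ETL tool would use its own scheduling and connectors.

```python
import sqlite3

def next_micro_batch(conn, last_seq, limit=1000):
    """Return staged changes above the watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT seq, op, sku, price FROM staging_prices "
        "WHERE seq > ? ORDER BY seq LIMIT ?",
        (last_seq, limit),
    ).fetchall()
    new_watermark = rows[-1][0] if rows else last_seq
    return rows, new_watermark

# In-memory stand-in for the staging area that the CDC feed writes to.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staging_prices "
    "(seq INTEGER PRIMARY KEY, op TEXT, sku TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO staging_prices VALUES (?, ?, ?, ?)",
    [(1, "insert", "A1", 2.5), (2, "update", "A1", 2.25)],
)

# First micro-batch picks up everything staged so far.
batch1, watermark = next_micro_batch(conn, last_seq=0)

# More changes trickle in during the day; the next run sees only those.
conn.execute("INSERT INTO staging_prices VALUES (3, 'delete', 'A1', NULL)")
batch2, watermark = next_micro_batch(conn, watermark)
```

Each run processes only the rows above the stored watermark, so the transformation step can be scheduled every few minutes instead of once per night.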

HVR also provides low-level transformation capabilities, such as converting all source operations into inserts downstream (e.g. update and delete operations are written to the target as inserts, to maintain a history of all changes that have occurred). Additional metadata can be attached to these records to identify the original operation type, the source commit timestamp, the originating system of the transaction, and so on.
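The conversion described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not HVR configuration: the field names `op_type`, `commit_ts` and `source_system`, the change layout and the system name `store_db` are all assumptions.

```python
def to_history_row(change, source_system):
    """Turn any captured operation into an insert-ready history row."""
    row = dict(change["row"])               # copy, so the captured change is untouched
    row["op_type"] = change["op"]           # original operation: insert/update/delete
    row["commit_ts"] = change["commit_ts"]  # commit time on the source system
    row["source_system"] = source_system    # where the transaction originated
    return row

# Hypothetical changes captured for one product during a single day.
captured = [
    {"op": "insert", "commit_ts": "2017-03-01T09:00:00Z",
     "row": {"sku": "A1", "price": 2.5}},
    {"op": "update", "commit_ts": "2017-03-01T11:30:00Z",
     "row": {"sku": "A1", "price": 2.25}},
    {"op": "delete", "commit_ts": "2017-03-01T16:45:00Z",
     "row": {"sku": "A1", "price": 2.25}},
]

# Every operation becomes one appended row; nothing is overwritten or removed.
history = [to_history_row(change, "store_db") for change in captured]

for row in history:
    print(row)
```

Because the target only ever receives inserts, the full sequence of price changes stays queryable, and the `op_type` column tells downstream transformations how to interpret each row.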

Want to see for yourself? We invite you to a trial of HVR.

About Zulf

Zulf is the Senior Solutions Architect at HVR. He has worked in the Data Integration space for over 20 years for companies such as Oracle and Pentaho.

© 2017 HVR Software
