Quick Start for HVR into Kafka


This appendix shows how to set up an HVR channel (called hvr_demo01) to replicate from an Oracle Database into Kafka. In this example HVR replicates from a schema inside a single Oracle instance on the hub machine into two Kafka locations: one that receives messages in JSON format and another that uses Confluent's Schema Registry with its 'micro' AVRO format. The steps below start by creating new users and tables for HVR to replicate from. In a real-life situation, the tables and data are likely to already exist.

Before following this quickstart, please make sure the requirements have been met; see Requirements for Kafka.

Create Test Schema and Tables

  1. See Create Source Schema and Tables in Oracle.

Install HVR on the Hub

First read section Introduction, which explains HVR's terminology and architecture. In particular, it explains the importance of a hub database.

Then install the HVR software on the hub machine by following the installation steps in section Installing HVR on Windows or Installing HVR on Unix or Linux. If the hub machine is a Unix machine, then HVR can also be installed on a Windows PC (so the HVR GUI runs on the PC and connects to the Unix hub machine), or the HVR GUI can run on the Unix hub machine and connect back to an X server running on a PC.

This Quickstart assumes the Oracle Database on the hub server is also the source database. Most real-time integration scenarios use log-based capture. To enable log-based capture, configure the following:

  • The user name that HVR uses must be a member of Oracle's dba group. On Unix and Linux this can be done by adding the user name used by HVR to the line in /etc/group that begins with dba. On Windows, right–click My Computer and select Manage ▶ Local Users and Groups ▶ Groups ▶ ora_dba ▶ Add to Group ▶ Add.
  • The Oracle instance should have archiving enabled. Archiving can be enabled by running the following statement as sysdba against a mounted but unopened database: alter database archivelog. The current state of archiving can be checked with the query select log_mode from v$database.

The current archive destination can be checked with the query select destination, status from v$archive_dest. By default, this returns the values USE_DB_RECOVERY_FILE_DEST and VALID, which indicate a destination inside the flash recovery area. Alternatively, an archive destination can be defined with the following statement: alter system set log_archive_dest_1='location=/disk1/arc' and then restarting the instance.
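The checks and statements above can be combined into a single sqlplus session run as SYSDBA. The sketch below assumes the database can be briefly restarted; the /disk1/arc path is only an example:

```sql
-- Check the current archiving state (returns ARCHIVELOG once enabled)
SELECT log_mode FROM v$database;

-- Enable archiving; the database must be mounted but not open
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;

-- Verify the archive destination and its status
SELECT destination, status FROM v$archive_dest;
```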

Create the Hub Database

Create the hub database, in which the HVR GUI will store the channel definition. This is actually another user/schema in the Oracle instance.

$ sqlplus system/manager
SQL> create user hvrhub identified by hvr
 2  default tablespace users
 3  temporary tablespace temp
 4  quota unlimited on users;
 
SQL> grant create session to hvrhub;
SQL> grant create table to hvrhub;
SQL> grant create sequence to hvrhub;
SQL> grant create procedure to hvrhub;
SQL> grant create trigger to hvrhub;
SQL> grant create view to hvrhub;
SQL> grant execute any procedure to hvrhub;

Connect To Hub Database

Start the HVR GUI on a PC by clicking on the HVR GUI icon (this is created by the HVR Installer for Windows) or by running hvrgui on Linux.

First, register the hub database: right–click on hub machines ▶ Register hub, then enter the connection details.

SC-Hvr-RegisterHub Oracle remote 4343.png
In this example the hub is a machine called guam, where an INET daemon is listening on port 4343. See section Installing HVR on Unix or Linux for how to configure this.

For a new hub database a dialog will prompt Do you wish to create the catalogs?; answer Yes.

Create Oracle Location

Next create a location for the Oracle source database using right–click on Location Configuration ▶ New Location.

SC-Hvr-Location Oracle db1.png

In this example there is no need to check Connect to HVR on remote machine because testdb1 is on the same machine as the hub.
Ignore the Group Membership tab for now.

Create Kafka Locations

To create a Kafka location, right–click on Location Configuration ▶ New Location.

  • HVR creates a Kafka location that sends messages in JSON format by default.

SC-Hvr-Location Kafka.png

  • To create a Kafka location that sends messages in (micro) AVRO format, select Schema Registry (Avro) and provide the URL for Schema Registry.

SC-Hvr-Location Kafka AVRO.png
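A consumer of the JSON-format location receives one JSON document per change message. The exact field layout depends on the channel's actions and format options, so the field names below are illustrative assumptions, not HVR's fixed schema:

```python
import json

# Illustrative only: a change message for a hypothetical product table.
# The real layout depends on the replicated table and channel actions.
sample_message = '{"prod_id": 1, "prod_name": "Widget", "prod_price": 9.99}'

# Decode the message body into a dict for downstream processing
row = json.loads(sample_message)
print(row["prod_name"])
```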

Create Channel

The next step is to create a channel. For a relational database, the channel represents a group of tables that is captured as a unit. To create a channel, perform the following procedure:

  1. Right–click on Channel Definitions ▶ New Channel.
  2. Enter a channel name (for example hvr_demo01).
  3. Click OK to create a channel.

Create Location Groups

The channel must have two location groups (for example SRC and KAFKA). Perform the following procedure to create a location group:

  1. Right–click the Location Groups ▶ New Group.
  2. Enter a group name (for example SRC).
  3. Select the location (for example src) to add it as a member of the group.
  4. Click OK to save.

To create the second location group:

  1. Right–click the Location Groups ▶ New Group.
  2. Enter a group name (for example KAFKA).
  3. Select the location (for example kafjs and kafav) to add it as a member of the group.
  4. Click OK to save.

Create Tables

The new channel also needs a list of tables to replicate. Perform the following procedure to create tables:

  1. Right–click on Tables ▶ Table Explore.
  2. Choose the source database location and click Connect.
  3. In the Table Explore window, select the table(s) and click Add.
  4. In the HVR Table Name dialog, click OK.
  5. Close the Table Explore window.

Define Actions

The new channel needs actions to define the replication. To define actions:

  1. Right–click on group SRC ▶ New Action ▶ Capture.
  2. Right–click on group KAFKA ▶ New Action ▶ Integrate.
  3. Right–click on group KAFKA ▶ New Action ▶ ColumnProperties.
    1. Select /Name=op_val.
    2. Select /Extra.
    3. Select /IntegrateExpression={hvr_op}.
    4. Select /Datatype=number.
    5. Click OK to save and close the New Action window.
  4. Right–click on group KAFKA ▶ New Action ▶ ColumnProperties.
    1. Select /Name=integ_seq.
    2. Select /Extra.
    3. Select /IntegrateExpression={hvr_integ_seq}.
    4. Select /Timekey.
    5. Select /Datatype=varchar.
    6. Select /Length=32.
    7. Click OK to save and close the New Action window.

SC-Hvr Channel Kafka.PNG
Note that the Actions pane only displays actions related to the object selected in the left–hand pane. So click on channel hvr_demo01 to see both actions.
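The ColumnProperties actions above add two extra columns to every message: op_val carries the operation type ({hvr_op}) and integ_seq carries a change sequence ({hvr_integ_seq}). A minimal sketch of how a consumer might interpret op_val; the mapping below covers the common values (0=delete, 1=insert, 2=update), so check the HVR documentation for the full list:

```python
# Common hvr_op values; an assumption to verify against the HVR docs.
HVR_OP_NAMES = {0: "delete", 1: "insert", 2: "update"}

def describe(message: dict) -> str:
    """Return a readable operation name for a decoded change message."""
    op = message.get("op_val")
    return HVR_OP_NAMES.get(op, f"unknown ({op})")

# Example: a decoded insert message with the extra op_val column
print(describe({"prod_id": 1, "op_val": 1}))  # insert
```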

Enable Replication with HVR Initialize

Now that the channel definition is complete, create the runtime replication system with HVR Initialize.

  1. Right–click on channel hvr_demo01 ▶ HVR Initialize.
  2. Select Create or Replace Objects.
  3. Click Initialize.

SC-Hvr-Initialize demo01.png

After initializing the channel hvr_demo01, all new transactions that start on the database testdb1 will be captured by HVR (when its capture job looks inside the transaction logs).
HVR Initialize also creates the replication jobs under the Scheduler node.

Start Scheduling of Capture Job

Next, perform the following procedure to start scheduling of the capture job:

  1. Right-click the Scheduler node of the hub database.
  2. Click Start to start the scheduler.
    SC-Hvr-Gui-Scheduler Kafka.png
  3. Right-click the capture job group hvr_demo01-cap ▶ Start to start the capture job.
    SC-Hvr-Gui demo01 Start cap Kafka.png

The capture job inside the Scheduler executes a script under $HVR_CONFIG/job/hvrhub/hvr_demo01 that has the same name as the job. So job hvr_demo01-cap-src detects changes on database testdb1 and stores these as transaction files on the hub machine.

Perform Initial Load

Perform the following procedure for the initial load from OLTP database into Kafka using HVR Refresh.

  1. Right–click the channel hvr_demo01 ▶ HVR Refresh.
  2. Select the target locations kafav and kafjs.
    SC-Hvr-Gui-Refresh Kafka.png
    You can optionally set Parallelism for Tables.
  3. Click Refresh to begin HVR Refresh.

Start Scheduling of Integrate Job

Perform the following procedure to unsuspend the integrate job in the HVR Scheduler so that changes are pushed into Kafka on the defined schedule.

  1. Right-click the integrate job group hvr_demo01-integ ▶ Unsuspend to resume the integrate job.
    SC-Hvr-Gui demo01 Start int Kafka.png

The integrate job inside the Scheduler executes a script under $HVR_CONFIG/job/hvrhub/hvr_demo01 that has the same name as the job. So the jobs hvr_demo01-integ-kafav and hvr_demo01-integ-kafjs pick up transaction files on the hub on the defined schedule and create messages on Kafka containing these changes.
A scheduled job that is not running is in a PENDING state.
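Because the channel uses /TimeKey, every change arrives as a separate message rather than overwriting earlier ones, so consumers reconstruct current state themselves. A sketch of that idea, ordering decoded messages by the extra integ_seq column; the field names and sequence values are illustrative:

```python
# Decoded change messages for a hypothetical product table.
# integ_seq ({hvr_integ_seq}) increases with commit order, so sorting by it
# and keeping the last message per key yields the latest row state.
changes = [
    {"prod_id": 1, "prod_price": 9.99, "op_val": 1, "integ_seq": "001"},
    {"prod_id": 1, "prod_price": 12.50, "op_val": 2, "integ_seq": "002"},
    {"prod_id": 2, "prod_price": 5.00, "op_val": 1, "integ_seq": "003"},
]

latest = {}
for change in sorted(changes, key=lambda c: c["integ_seq"]):
    latest[change["prod_id"]] = change  # later changes overwrite earlier ones

print(latest[1]["prod_price"])  # 12.5 -- the update wins over the insert
```

A real consumer would also drop keys whose last op_val is a delete; this sketch only shows the ordering principle.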