Quick Start for HVR into Greenplum

From HVR
Jump to: navigation, search

This quick start guide helps you to get started with HVR for replicating data into Greenplum database.

To proceed with this replication you must have basic understanding about HVR's architecture and terminologies like Hub, Location, Channel, Location Groups, Actions etc.

The example here demonstrates how to replicate tables from one Oracle schema (source location) to a Greenplum database (target location).

Before proceeding with this example ensure that the requirements for using HVR with Oracle and Greenplum are met. For information about access privileges and advanced configuration changes required for performing replication in Oracle and Greenplum, see Requirements for Oracle and Requirements for Greenplum respectively.

Create Demo Schemas and Tables

This section describes the procedure for creating Oracle schemas and tables that are to be replicated.

Skip this section if you already have a source schema with tables which you plan to use for this replication and/or you do not want to create tables manually in the target schema.

For this demonstration, create one source schema (sourcedb) with two tables (dm51_product and dm51_order) and one target schema (targetdb). Also ensure to insert values into these tables. For sample SQL statements to create demo schemas and tables, see Create Test Schemas and Tables in Oracle.

Note: HVR automatically creates tables in target schema during Refresh (initial loading) and it is the recommended method for creating tables in the target schema. However, if you want to manually create tables in target schema, see section Create Tables in Target Schema.

Create Hub Database

This section describes how to create a hub database (schema) in Oracle. The hub database is a small database which HVR uses to control its replication activities. The hub database contains HVR catalog tables that hold all specifications of replication such as the names of the replicated databases, the replication direction and the list of tables to be replicated.

  1. Create the hub database (hvrhub) with password (hvr).
  2. create user hvrhub
    identified by hvr
    default tablespace users
    temporary tablespace temp
    quota unlimited on users; 

Grants/Access Privileges

This section describes the grants/access privileges required for the source schema, target schema, and hub database.

  1. Configure the privileges for source schema (sourcedb). For more information, see section Grants for Log-Based Capture Database (Oracle) in Requirements for Oracle.
  2. grant create session to sourcedb;
    grant create table to sourcedb;
    grant alter table to sourcedb;
    grant select any dictionary to sourcedb;
    grant select any transaction to sourcedb;
  3. Configure the privileges for target schema (targetdb). For more information, see section Grants for Compare, Refresh and Integrate in Requirements for Greenplum.
  4. Configure the privileges for hub database (hvrhub).
  5. grant create session to hvrhub;
    grant create table to hvrhub;
    grant create procedure to hvrhub;
    grant create trigger to hvrhub;
    grant execute on dbms_alert to hvrhub;

Download and Install HVR

HVR distribution is available for download at https://www.hvr-software.com/account/ or to request a trial version visit https://www.hvr-software.com/free-trial/

Install HVR on a hub machine. For details on installing HVR, see the respective Operating System sections:

After the installation, you can control HVR using HVR's graphical user interface (HVR GUI).

If the hub machine is Windows, then HVR GUI can be executed directly on the hub machine. To control HVR remotely from your PC, connect to the hub machine using Windows Remote Desktop Connection and execute HVR GUI on hub machine.

If the hub machine is Linux or Unix, then HVR GUI can be executed directly on the hub machine. However, application like X Server or VNC viewer must be installed to execute HVR GUI directly on Linux. To control HVR remotely from your PC, install HVR on the PC (with Windows or MacOS) and configure HVR Remote Listener on hub machine.

If the hub machine is Unix, then HVR GUI should be typically executed remotely from a PC to control HVR installed on hub machine. To do this, install HVR on the PC (with Windows or MacOS) and configure HVR Remote Listener on hub machine.

Note: HVR Remote Listener allows you to connect HVR GUI available on your PC with the remote HVR hub machine.

Enable Log-Based Capture

To enable log-base capture, configure the following:

  • The user name that HVR uses must be added to Oracle's group.
    • On Unix and Linux, this can be done by adding the user name used by HVR to the line in /etc/group that begins with dba.
    • On Windows, right–click My Computer and select Manage ▶ Local Users and Groups ▶ Groups ▶ ora_dba ▶ Add to Group ▶ Add.
  • The Oracle instance should have archiving enabled. Archiving can be enabled by executing the following statement as sysdba against a mounted but unopened database: alter database archivelog. The current state of archiving can be checked with query select log_mode from v$database. For more information, see section Configuring Redo Files and Archiving for Log-Based Capture in Requirements for Oracle.

Launch HVR GUI

This section describes how to launch HVR GUI on various Operating Systems.

  • On Windows and MacOS, double-click the HVR shortcut icon available on the desktop.
  • On Linux, double-click the HVRGUI file available in the directory <HVR extracted path>/bin or else execute the command hvrgui.
  • On Unix, execute the command hvrgui.
Note: Linux and Unix requires applications like X server or VNC viewer to execute HVR GUI.

Register Hub

This section describes how to connect HVR GUI to the hub database.

When you launch HVR GUI for the first time Register Hub window is displayed automatically. Register Hub can also be accessed from menu File ▶ Register Hub. Skip step 1 to 4 if you wish to execute HVR GUI directly on the hub machine

  1. Click Connect to HVR on remote machine.
  2. Note: To connect HVR GUI on a PC to the remote HVR hub machine, the HVR Remote Listener must be configured and running on the HVR hub machine.
  3. Enter name or IP address of the hub machine in Node.
  4. Enter port number (defined in the HVR Remote Listener of the hub machine) in Port.
  5. Enter the Login and Password for the hub machine. By default, this is the Operating System login credentials of the hub machine.
  6. Select Oracle in Class.
  7. Provide Database Connection details.
    1. Enter directory path for ORACLE_HOME. You can also click browse to select directory path.
    2. Enter Oracle System ID in ORACLE_SID or TNS credential.
    3. Enter user name of hub database in User. For example, hvrhub.
    4. Enter password for hub database in Password. For example, hvr.
  8. Click Connect.
  9. SC-Hvr-QSG Oracle RegisterHub.png

  10. Click Yes in the prompt dialog asking to create catalog tables in HVR hub database.
  11. Note: HVR displays this prompt when connecting to a hub database for the first time.

On connecting successfully to the hub database, the navigation tree pane displays the hub machine and the hub database. Location Configuration, Channel Definitions, and Scheduler are displayed under the hub database.

Create Locations

This section describes how to create locations in HVR GUI. Location is a storage place (for example, database or file storage) from where HVR captures (source location) or integrates (target location) changes.

  • Create source location (src) connected to source schema (sourcedb) in Oracle.
    1. In navigation tree pane, right–click Location Configuration ▶ New Location.
    2. Enter Location name and Description for the location.
    3. Select Oracle in Class.
    4. Provide Database Connection details. For more information on Database Connection fields, see section Location Connection in Requirements for Oracle.
      1. Enter directory path for ORACLE_HOME. You can also click browse to select directory path.
      2. Enter Oracle System ID in ORACLE_SID or TNS credential or RAC credential.
      3. Note: For RAC connectivity, ensure to provide remote machine connection details under Connection tab.
      4. Enter username of the source schema in User. For example, sourcedb.
      5. Enter password for source schema in Password.
    5. Click Test Connection to verify the connection to location database.
    6. Click OK.
    7. SC-Hvr-QSG Oracle NewLocation.png

  • Create target location (tgt) connected to target schema (targetdb) in Greenplum.
    1. In navigation tree pane, right–click Location Configuration ▶ New Location.
    2. Enter Location name and Description for the location.
    3. Select Greenplum in Class.
    4. Provide Database Connection details. For more information on Database Connection fields, see section Location Connection in Requirements for Greenplum.
      1. Enter the hostname or ip-address of the Greenplum server in Node.
      2. Enter the Port number on which the Greenplum server is expecting connections.
      3. Enter the name of the Greenplum Database.
      4. Enter username of the target database in User. For example, targetdb.
      5. Enter password for target database in Password.
    5. Click Test Connection to verify the connection to location database.
    6. Click OK.
    7. SC-Hvr-QSG Greenplum NewLocation.png

Create Channel

This section describes how to create a channel (hvrdemo) in HVR.

  1. In navigation tree pane, right–click Channel Definitions ▶ New Channel.
  2. SC-Hvr-QSG Oracle NewChannel.png

  3. Enter Channel name and Description for the channel in New Channel dialog.
  4. Click OK.

Create Location Groups

This section describes how to create location groups in a channel. The location groups are used for defining action on the location. Typically a channel contains two location groups - one for the source location and one for the target location. Each location group can contain multiple locations.

In this example, create one source location group (SRCGRP) and one target location group (TGTGRP).

  1. In navigation tree pane, click + next to the channel (hvrdemo).
  2. Create source location group (SRCGRP):
    1. Right–click Location Groups ▶ New Group.
    2. Enter Group Name and Description for the location group.
    3. Select source location (src) from Group Membership.
    4. SC-Hvr-QSG Greenplum NewSourceGroup.png

    5. Click OK.
  3. Create target location group (TGTGRP):
    1. Right–click Location Groups ▶ New Group.
    2. Enter Group Name and Description for the location group.
    3. Select target location (tgt) from Group Membership.
    4. SC-Hvr-QSG Greenplum NewTargetGroup.png

    5. Click OK.

Select Table(s)

This section describes how to select the tables (dm51_product and dm51_order) from source location for replication. Table Explore allows you to select schema(s) and/or table(s) for replication.

  1. Right–click Tables ▶ Table Explore.
  2. Select source location (src) from the list.
  3. SC-Hvr-QSG Greenplum TableExplore.png

  4. Click Connect.
  5. Select tables from Table Explore dialog. Press Shift key to select multiple tables or Ctrl+A to select all tables.
  6. Click Add to add the selected tables.
  7. Click OK in HVR Table Name dialog.
  8. Click Close in Table Explore dialog.
  9. SC-Hvr-QSG Oracle TableExploreSelectTables.png

Define Actions

This section describes how to define Actions on the location groups (SRCGRP and TGTGRP). Actions define the behavior of a replication activity.

  1. Define action Capture to capture changes from all tables in the source location group.
    1. Right–click source location group SRCGRP ▶ New Action ▶ Capture.
    2. SC-Hvr-QSG Oracle Action Capture.png

    3. Click OK.
  2. Define action Integrate to integrate changes into all tables in the target location group.
    1. Right–click target location group TGTGRP ▶ New Action ▶ Integrate.
    2. Select parameter /Burst. HVR’s Integrate with Burst improves the data replication performance by using the Burst algorithm, for more information, see /Burst in Integrate.
    3. SC-Hvr-QSG Greenplum Action Integrate.png

    4. Click OK.
  3. Define action LocationProperties to use the Greenplum Parallel File Distribution (gpfdist) server for bulk loading operations (Bulk Refresh and Integrate with Burst) into Greenplum for maximum performance. Using Greenplum’s External Tables with gpfdist data loading is the fastest way to load large tables into Greenplum . For more information, see section Bulk Refresh and Burst Integrate in Requirements for Greenplum.
    Note: Before proceeding with this step, ensure to set values for /StagingDirectoryHvr and /StagingDirectoryDb as mentioned in section Bulk Refresh and Burst Integrate.
    1. Right–click target location group TGTGRP ▶ New Action ▶ LocationProperties.
    2. Select parameter /StagingDirectoryHvr.
    3. Browse and select the directory for bulk load staging files. This directory should be on the machine where HVR connects to the source database
    4. Select parameter /StagingDirectoryDb.
    5. Enter the local directory on the Greenplum head-node or a URL pointing to /StagingDirectoryHvr.
    6. SC-Hvr-QSG SapXForm Action LocationPropertiesStaging.png

    7. Click OK.
Note: The Actions pane only displays actions related to the object selected in the navigation tree pane. Click on the channel name (hvrdemo) to view actions defined for all location groups in the selected channel.

Initialize

This section describes how to initialize the replication. HVR Initialize first checks the channel and creates replication jobs in the HVR Scheduler.

In this example, HVR Initialize creates one capture job (hvr_demo-cap-src) and one integrate job (hvr_demo-integ-tgt).

  1. Right–click channel hvrdemo ▶ HVR Initialize.
  2. Select Create or Replace Objects in HVR Initialize dialog.
  3. Click Initialize.
  4. SC-Hvr-QSG Greenplum Initialize.png

  5. Click OK.
  6. SC-Hvr-QSG Snowflake InitializeSuccess.png

  7. Click Close.

Click Scheduler node in navigation tree pane to view the capture and integrate jobs in Jobs tab.

For more information about initiating replication in HVR, see Starting Replication.

Start Scheduler and Create Windows Service

This section describes how to start the HVR Scheduler and create Windows service.

  1. Start Scheduler. In the navigation tree pane, right-click Scheduler ▶ Start.
  2. SC-Hvr-QSG Oracle SchedulerStart.png

  3. Click Create... in the prompt asking to create the service hvrscheduler_hvrhub.
  4. Select Local System Account ('SYSTEM') in Create Windows Service dialog.
  5. SC-Hvr-QSG Oracle SchedulerCreateWindowsService.png

  6. Click Create....

Start Capture Job

This section describes how to start the job for capturing changes from source location (src). By starting the Capture job in HVR Scheduler, HVR begins capturing all changes since the moment HVR Initialize was executed. This 'capture begin moment' can be modified using the option Capture Rewind available in the Advanced Options tab of HVR Initialize dialog.

  1. In the navigation tree pane, click Scheduler.
  2. Start capture job. In the Jobs pane, right-click capture job hvrdemo-cap-src ▶ Start.
  3. SC-Hvr-QSG Oracle StartCapture.png

  4. Click Yes in Start dialog.
Note: On starting the capture job (hvrdemo-cap-src) successfully, the status of the job changes from SUSPEND to RUNNING.

Refresh

This section describes how to perform initial load into the target database. HVR Refresh copies all existing data from source location (src) to the target location (tgt) and optionally creates new tables and keys in target location.

  1. In the navigation tree pane, right–click channel hvrdemo ▶ HVR Refresh.
  2. Select Create Absent Tables.
  3. SC-Hvr-QSG Greenplum Refresh.png

  4. Click Refresh.
  5. Click Yes to begin HVR Refresh.
    When the refresh is completed, the Refresh Result dialog displays the total number of rows replicated from the selected tables.
  6. SC-Hvr-QSG Oracle RefreshResult.png

  7. Click Close in Refresh Result dialog.
  8. Click Close in HVR Refresh dialog.

Start Integrate Job

This section describes how to start the job to integrate changes into the target location (tgt).

  1. In the navigation tree pane, click Scheduler.
  2. Start integrate job. In the Jobs pane, right-click integrate job hvrdemo-integ-tgt ▶ Start.
  3. SC-Hvr-QSG Oracle StartIntegrate.png

  4. Click Yes in Start dialog.
Note: On starting the integrate job (hvr_demo-integ-tgt) successfully, the status of the job changes from SUSPEND to RUNNING.

Verify Replication

This section describes how to verify HVR's replication activity.

  • View Log file. HVR creates separate log file for the hub, channel (hvrdemo), and for each replication jobs (hvrdemo-cap-src and hvrdemo-integ-tgt). This log file contains the details of the changes captured and integrated. To view the replication activity log,
    1. In navigation tree pane, click + next to the Scheduler.
    2. SC-Hvr-QSG ViewLog.png

    3. Click hvr_demo ▶ View Log to view the output of the jobs in Log tab for the selected channel (e.g. Log of channel hvrdemo).
    4. SC-Hvr-QSG LogResult.png

      Note: The directory path for HVR log files is displayed in the log tab.
    5. Update the value(s) in source location database. The Log tab will immediately display the details of capture job (hvrdemo-cap-src) and integrate job (hvrdemo-integ-tgt) to reflect the capture and integrate of this change.
  • HVR Compare allows you to verify the replication activity by comparing the data in source and target locations. To compare the source and target locations,
    1. Stop the Integrate job (hvrdemo-integ-tgt),
      1. In the navigation tree pane, click Scheduler.
      2. In the Jobs pane, right-click integrate job hvrdemo-integ-tgt ▶ Suspend.
      3. SC-Hvr-QSG Oracle SuspendIntegrate.png

      4. Click Yes in Start dialog.
    2. Update the value(s) in source location database.
    3. Execute HVR Compare,
      1. In the navigation tree pane, right–click channel hvrdemo ▶ HVR Compare.
      2. Select source location (src) on the left side and target location (tgt) on the right side.
      3. Select Row by Row Granularity in the Options tab.
      4. SC-Hvr-QSG Greenplum HVRCompare.png

      5. Click Compare.
      6. On completion, Compare Result dialog is displayed. If the State column displays Different, it indicates the data in source and target locations are not identical.
      7. SC-Hvr-QSG Oracle HVRCompareResult.png

      8. Click Close in Compare Result dialog and HVR Compare dialog.
    4. Start Integrate Job (hvrdemo-integ-tgt).
    5. Execute HVR Compare again (step 3).
      In Compare Result dialog, if the State column displays Identical, it indicates the changes are replicated successfully.
    6. SC-Hvr-QSG Oracle HVRCompareResultAfterIntegrate.png