Quick Start for File Replication

This appendix shows how to set up an HVR channel (called hvr_demo31) to replicate files between directories. In real life, HVR would usually replicate between directories on different machines, some of which would be reached via FTP, SFTP or SharePoint/WebDAV. For simplicity, in this example HVR replicates between three directories that are all created on the hub machine: files are captured from directory /tmp/f1 and copied to /tmp/f2 and /tmp/f3. The channel is a 'blob' file channel, which means it has no table information and simply treats each file as a sequence of bytes without interpreting its format.

Before following this quick start, make sure the requirements for your location type have been met; see Requirements for FTP SFTP and SharePoint WebDAV, Requirements for HDFS, or Requirements for S3.

WD-Quickstart-for-File-Replication.png

Create File Locations

Create three directories to test replication:

$ mkdir /tmp/f1
$ mkdir /tmp/f2
$ mkdir /tmp/f3
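
The same setup can be scripted so that it is safe to re-run. The following is a minimal sketch and not part of HVR: the function name prepare_dirs is hypothetical, and the permission check only covers the user running the script, which is assumed to be the same account that will run HVR.

```shell
#!/bin/sh
# Hypothetical helper: create the demo directories and check they are usable.
prepare_dirs() {
    for dir in "$@"; do
        # -p makes the call a no-op if the directory already exists
        mkdir -p "$dir" || return 1
        # Only the current user's access is checked here
        if [ ! -r "$dir" ] || [ ! -w "$dir" ]; then
            echo "not readable/writable: $dir" >&2
            return 1
        fi
    done
}

prepare_dirs /tmp/f1 /tmp/f2 /tmp/f3 && echo "directories ready"
```

Using mkdir -p instead of plain mkdir means the script does not fail if the directories are left over from an earlier run.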

Install HVR

First read section Introduction, which explains HVR's terminology and architecture. In particular, it explains the importance of a hub database.
Then install the HVR software on the hub machine by following the steps in section Installing HVR on Windows or Installing HVR on Unix or Linux. If the hub machine runs Unix, there are two ways to run the HVR GUI: install HVR on a Windows PC as well, so that the GUI runs on the PC and connects to the Unix hub machine, or run the GUI on the Unix hub machine and have it connect back to an X server running on a PC.

Create the Hub Database

Create the hub database, in which the HVR GUI will store the channel definition. This can be an Ingres database, an Oracle schema, or a SQL Server, DB2 (LUW), DB2 for i, Postgres or Teradata database. The steps to create an Oracle hub database schema are as follows:

$ sqlplus system/manager
SQL> create user hvrhub identified by hvr
 2  default tablespace users
 3  temporary tablespace temp
 4  quota unlimited on users;
 
SQL> grant create session to hvrhub;
SQL> grant create table to hvrhub;
SQL> grant create sequence to hvrhub;
SQL> grant create procedure to hvrhub; 
SQL> grant create trigger to hvrhub;
SQL> grant create view to hvrhub;
SQL> grant execute any procedure to hvrhub;
 
$ sqlplus
Enter user-name: / as sysdba
SQL> grant execute on dbms_alert to hvrhub;
SQL> exit;

See also: Create the Hub Database in Ingres, Create the Hub Database in SQL Server

Connect to Hub Database

Start the HVR GUI on a PC by clicking on the HVR GUI icon (this is created by the HVR Installer for Windows) or by running hvrgui on Linux.

First, register the hub database: right-click on hub machines ▶ Register hub, then enter the connection details.

SC-Hvr-RegisterHub Oracle remote 4343.png
In this example the hub is a machine called guam, where an INET daemon is listening on port 4343. See section Installing HVR on Unix or Linux for how to configure this.

For a new hub database a dialog will prompt Do you wish to create the catalogs?; answer Yes.

See also: Connect to Hub Database (Ingres), Connect to Hub Database in SQL Server

Create Channel and Location Groups

Next, create three file locations (one for each replicated directory) by right-clicking on Location Configuration ▶ New Location.
SC-Hvr-Location FIle db1.png

In this example there is no need to check Connect to HVR on remote machine because /tmp/f1 is on the same machine as the hub.
Ignore the Group Membership tab for now.
Make locations for /tmp/f2 and /tmp/f3 as well.
Now define a channel using Channel Definitions ▶ New Channel.
SC-Hvr-LocationGroup demo01 CENTRAL fileLocation.png

The channel needs two location groups. Under the new channel, right-click on Location Groups ▶ New Group and enter a group name (for instance CENTRAL).
Add location f1 as a member of this group by checking the box for f1.
Then create a second location group, called DECENTRAL, that has members f2 and f3.
This is a 'blob' file channel so it has no table layout information.

Define Actions

The new channel needs two actions to indicate the direction of replication.

  • Right-click on group CENTRAL ▶ New Action ▶ Capture. If parameter /DeleteAfterCapture is checked, then HVR will remove files from the source directory after they are captured. Otherwise, the contents of the directory (and its subdirectories) will be copied and changes will be detected by monitoring the files' timestamps.
  • Right-click on group DECENTRAL ▶ New Action ▶ Integrate.
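
To make the idea of timestamp-based change detection concrete, here is a small shell sketch. It only illustrates the general technique of comparing file modification times against a reference point; it is not how HVR implements capture, and the function name changed_since and the marker file are hypothetical.

```shell
#!/bin/sh
# Illustration only: list regular files under a directory (including its
# subdirectories) that were modified after a marker file was last touched.
changed_since() {
    marker=$1
    dir=$2
    find "$dir" -type f -newer "$marker"
}

# Self-contained demo in a scratch directory
demo=$(mktemp -d)
mkdir "$demo/src"
touch "$demo/marker"            # remember the time of the "last scan"
sleep 1                         # make sure the next file is strictly newer
echo hello > "$demo/src/world"
changed_since "$demo/marker" "$demo/src"   # lists the new file
```

A scan-based approach like this has to remember when it last looked, which the marker file does here. It also shows why deleting files after capture is simpler: nothing needs to be compared on the next cycle.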

SC-Hvr-Gui LogGroups demo36.png

Note that the Actions pane only displays actions related to the objects selected in the left-hand pane. So click on channel hvr_demo31 to see both actions.

Enable Replication with HVR Initialize

Now that the channel definition is complete, create the runtime replication system.
Right-click on channel hvr_demo31 ▶ HVR Initialize. Choose Create or Replace Objects and click HVR Initialize.
SC-Hvr-InitializeFinished demo01.png

From the moment that HVR Initialize is done, all changes in directory /tmp/f1 will be captured by HVR.
HVR Initialize also creates three replication jobs, which can be seen under the Scheduling node in the HVR GUI.

Start Scheduling of Replication Jobs

Start the Scheduler on the hub machine by clicking in the HVR GUI on the Scheduler node of the hub database.
SC-Hvr-Gui-Scheduler oracle.png

Next, instruct the HVR Scheduler to trigger the replication jobs.
SC-Hvr-Gui demo01 Start.png

The replication jobs inside the Scheduler each execute a script under $HVR_CONFIG/job/hvrhub/hvr_demo31 that has the same name as the job. So job hvr_demo31-cap-f1 detects changes in directory /tmp/f1 and stores these as transaction files on the hub machine. The other two jobs (hvr_demo31-integ-f2 and hvr_demo31-integ-f3) pick up these transaction files and copy new or modified files to the two target directories.

Test Replication

To test replication, create or copy any file in directory /tmp/f1:

$ echo hello > /tmp/f1/world

In the HVR log file you can see the output of the jobs by clicking on View Log. With the hub and channel names used in this example, this log file can be found in $HVR_CONFIG/log/hvrhub/hvr_demo31-cap-f1.
SC-Hvr-Gui demo01 viewlog.png


The job output looks like this:

hvr_demo31-integ-f3: Waiting...
hvr_demo31-cap-f1: Capture cycle 1 for 2 files (37 bytes).
hvr_demo31-cap-f1: Routed 150 bytes (compression=19.8%) from 'f1' into 2 locations.
hvr_demo31-integ-f2: Integrate cycle 2 for 1 transaction file (150 bytes).
hvr_demo31-cap-f1: Waiting...
hvr_demo31-integ-f2: Moved 2 files to 'C:\tmp\f2'.
hvr_demo31-integ-f2: Waiting...
hvr_demo31-integ-f3: Integrate cycle 2 for 1 transaction file (150 bytes).
hvr_demo31-integ-f3: Moved 2 files to 'C:\tmp\f3'.
hvr_demo31-integ-f3: Waiting...

This indicates that the new file has been replicated to directories /tmp/f2 and /tmp/f3. Look in these directories to confirm this.
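
The check can also be scripted. The sketch below is not an HVR utility: wait_for_copy is a hypothetical helper that polls until a target file exists and is byte-identical to a reference file, or gives up after a timeout. It assumes the reference file is still available; if /DeleteAfterCapture is checked, the source file is removed after capture, so compare against a copy kept outside /tmp/f1 instead.

```shell
#!/bin/sh
# Hypothetical helper: succeed once dst exists and matches src byte for byte,
# fail if that has not happened within the timeout (seconds, default 30).
wait_for_copy() {
    src=$1
    dst=$2
    timeout=${3:-30}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # cmp -s prints nothing and returns 0 only for identical contents
        if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}

# Example use for this quick start (commented out; needs a running channel):
# for d in /tmp/f2 /tmp/f3; do
#     wait_for_copy /tmp/f1/world "$d/world" 60 || echo "not replicated to $d"
# done
```

Polling with a timeout is used because the capture and integrate jobs run in cycles, so the file may take a moment to appear in the target directories.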