Action Transform defines a data mapping that is applied inside the HVR pipeline. A transform can be either a command (a script or an executable) or built into HVR. The transform runs after the data has been captured from a location and before it is integrated into the target, and it is fed all the data in that job's cycle. To change the contents of a file as HVR reads it or writes it, use action FileFormat with parameter CaptureConverter or IntegrateConverter; action Transform runs between those two converters.
A command transform is fed data in XML format; this is a representation of all the data that passes through HVR's pipeline. The format is defined in HVR_HOME/etc/xml/hvr_private.dtd.
This section describes the parameters available for action Transform.
Name of the transform command. This can be a script or an executable. Scripts can be shell scripts on Unix and batch scripts on Windows, or files beginning with a 'magic line' naming the interpreter for the script, e.g. #!perl.
Argument path can be an absolute or a relative pathname. If a relative pathname is supplied, the command should be located in HVR_CONFIG/plugin/transform.
A transform command should read data from its stdin and write the transformed bytes to stdout. If a transform command encounters a problem, it should write an error to stderr and exit with code 1, which causes the replication job to fail. The transform command is called with multiple arguments, which should be defined using parameter CommandArguments.
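The contract above (XML on stdin, transformed bytes on stdout, errors to stderr with exit code 1) can be sketched as a minimal Python transform. This is an illustrative sketch only: the element name mycol and the upper-casing change are hypothetical, and a real transform must follow the actual XML structure defined in HVR_HOME/etc/xml/hvr_private.dtd.

```python
#!python
# Minimal sketch of an HVR command transform (hypothetical example).
# Reads the cycle's XML from stdin, modifies it, writes it to stdout.
import sys
import xml.etree.ElementTree as ET

def transform(xml_bytes):
    root = ET.fromstring(xml_bytes)
    # Hypothetical change: upper-case the text of every <mycol> element.
    # The real element names come from hvr_private.dtd, not from this sketch.
    for col in root.iter("mycol"):
        if col.text:
            col.text = col.text.upper()
    return ET.tostring(root)

def main():
    try:
        sys.stdout.buffer.write(transform(sys.stdin.buffer.read()))
        return 0
    except Exception as exc:
        # On any problem: report on stderr and exit 1 so the job fails.
        sys.stderr.write("transform failed: %s\n" % exc)
        return 1

# A real transform script would finish with: sys.exit(main())
```

The error path matters as much as the happy path: returning a non-zero exit code is what signals HVR to fail the replication job rather than integrate half-transformed data.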
This parameter can be defined either on a specific table or on all tables (*). Defining it on a specific table can be slower because the transform is stopped and restarted each time the current table name changes. Defining it on all tables (*), however, means all data must pass through the transform, which can also be slower and costs extra resources (e.g. disk space for a command transform).
Command Transform Environment
A transform command inherits the environment of its parent process. On the hub, the transform's parent process is the HVR Scheduler. On a remote Unix machine, it is the inetd daemon; on a remote Windows machine, it is the HVR Agent Listener service. Differences from the parent's environment are as follows:
Value(s) of parameter(s) for transform (space separated).
Unpack the SAP pool, cluster, and long text table (STXL). By defining this parameter, HVR can capture changes from the SAP pool, cluster, and long text tables (binary STXL data) and transform them into unpacked readable data. For information about the requirements for using this parameter, see section Requirements for SapUnpack.
The SAP Unpack license is required for using this functionality.
Additional information about SapUnpack
Execute transform on hub instead of location's machine.
Run the transform in N parallel branches. Rows are distributed using a hash of their distribution keys, or round-robin if no distribution key is available. Parallelization starts only after the first 1000 rows.
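The distribution rule above (hash of the distribution key, falling back to round-robin) can be illustrated with a small sketch. The function name and the choice of CRC-32 as the hash are assumptions for illustration, not HVR's actual implementation.

```python
# Hypothetical sketch of distributing rows over N parallel branches:
# hash of the distribution-key values if a key exists, else round-robin.
import zlib

def branch_for(row, key_cols, n_branches, fallback_counter):
    """Pick a branch number in [0, n_branches) for one row."""
    if key_cols:
        # Hash the distribution-key values so rows with the same key
        # always land in the same branch (needed for ordering).
        key = "|".join(str(row[c]) for c in key_cols).encode()
        return zlib.crc32(key) % n_branches
    # No distribution key: spread rows round-robin.
    return next(fallback_counter) % n_branches
```

The key-hash path is what keeps all changes for one row in the same branch; with round-robin that guarantee is lost, which is why a distribution key is preferred when available.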
The value should be the name of a context (a lowercase identifier). It can also have the form !context, which means the action is effective unless the context is enabled. One or more contexts can be enabled for hvrrefresh or hvrcompare (on the command line with option
Defining an action that is only effective when a context is enabled can serve different purposes. For example, if parameters Command="C:/hvr/script/myscriptfile" and Context=qqq are defined, then during replication no transform is performed, but if hvrrefresh is run with context qqq enabled (option
Requirements for SapUnpack
HVR's SapUnpack feature allows capturing changes from SAP pool, cluster, and long text tables (STXL) and replicating them into a target database as "unpacked" data. For example, HVR can capture the SAP cluster table RFBLG from an Oracle-based SAP system and unpack its contents (BSEG, BSEC) into a Redshift database; the HVR pipeline performs the 'unpacking' dynamically.
The SapUnpack feature supports capturing changes from an SAP system running on any of the following databases: Db2 for i, Db2 for LUW, Oracle, or SQL Server.
The SAP database typically contains tables that fall into one of the following three categories:
- Transparent tables: ordinary database tables which can be replicated in a usual way.
- Pooled and Cluster tables: special in that the data for several Pooled or Cluster tables is grouped and physically stored together in a single database table.
- SAP Catalogs: contain metadata and do not usually need to be replicated. HVR's SapUnpack feature needs data from the SAP catalogs for processing the Pooled and Cluster tables.
To enable replication from an SAP database using SapUnpack, ensure that the SAP Dictionary tables (DD02L, DD03L, DD16S, DDNTF, DDNTT) exist in the source database. HVR uses the information in the SAP dictionary to unpack data from SAP pool and cluster tables.
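Before enabling SapUnpack it can be useful to verify that these dictionary tables are all present. A minimal, hypothetical helper is sketched below; it assumes you have already fetched the table list from your database's catalog (e.g. ALL_TABLES on Oracle) with your own driver, since the query itself is database-specific.

```python
# Hypothetical pre-check for the SAP dictionary tables SapUnpack requires.
# `existing_tables` is assumed to come from the source database's catalog.
REQUIRED = ("DD02L", "DD03L", "DD16S", "DDNTF", "DDNTT")

def missing_dictionary_tables(existing_tables):
    """Return the required SAP dictionary tables absent from existing_tables."""
    have = {t.upper() for t in existing_tables}
    return [t for t in REQUIRED if t not in have]
```

If the returned list is non-empty, the named tables must be made available in the source schema before SapUnpack can unpack pool and cluster data.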
Some tables contain clustered data even though SAP does not identify them in its dictionary as Cluster tables; these are not supported. Examples include PCL1, PCL2, and MDTC.
License for Unpacking SAP Tables
The regular HVR license does not enable the SapUnpack feature, so an additional license is needed. Contact HVR Support to obtain the proper SAP Unpack license.
HVR supports multi-licensing; this means a hub system can have several licenses registered at the same time. For example, one license could enable all HVR features (except SapUnpack) perpetually, while another enables the SapUnpack feature.