Implement Type 2 Slowly Changing Dimension (SCD) with additional Capture Time column

Viewing 1 post (of 1 total)
  • Author
    Posts
  • #14037
    ggoodrich
    Keymaster

    Expert Note:

    Implement Type 2 Slowly Changing Dimension (SCD) with additional Capture Time column

     

    Scenario:

    In this example, the DBA has been asked to use HVR to replicate to a slowing changing dimension (SCD) table by adding an additional row to the target table anytime a row changes on the source, however they also want an additional column to be added to track when the change was captured.

    This How To is like the concepts explained in an Expert Note in the HVR Forum titled Audit Tables and Slowly Changing Dimensions. See https://www.hvr-software.com/support-services/customer-resources/forum/topic/audit-tables-and-slowly-changing-dimensions/ for more information.

    What is unique with this solution is that demonstrates adding a column to display a sequence number for each row updated along with updating a previous row on target with a column which is derived from the HVR captured timestamp.

     

    Example:

    The following example demonstrates populate several additional columns, including; SEQ_NO which acts as the surrogate key, and VALID_FROM and VALID_TO which are generated by HVR based on metadata obtained from the HVR capture timestamp. Let’s start with an example of data for customer 1001 named Rick.

    SEQ    CUST_ID            NAME               VALID_FROM    VALID_TO

    1         1001                     RICK                 03-MAY-2018     NULL

     

    When the row on the source is updated changing the name from RICK to RICHARD, you will see a new row added to the target with a new sequence number and setting of the VALID_FROM and VALID_TO columns. For example:

     

    SEQ    CUST_ID            NAME               VALID_FROM    VALID_TO

    1         1001                     RICK                 03-MAY-2018     04-MAY-2018

    2         1001                     RICHARD          04-MAY-2018    NULL

     

    Overview:

    Note: The following example assumes the reader already understands HVR fundamental concepts. It does not demonstrate building a channel nor provide examples of deploying the channel.

    The following channel definition uses:

    • ColumnProperties to define the additional columns to calculate and store SEQ_NO, VALID_FROM and VALID_TO
    • DbObjectGeneration to define scripts to create a stored procedure
    • DbProc to execute the procedure during integration

     

    Channel Definition:

    The following example shows the actions defined on the channel which instructs the Integrate module to call a stored procedures DM01_ORDER__II and DM01_PRODUCT__II to set the VALID_FROM and VALID_TO timestamps. The HVR Initialize step using the -S argument generates the stored procedures based on the contents of the include_sql.txt text file listed on the DbObjectGeneration action.

    It also shows setting the SEQ_NO column which is part of the surrogate key along with CUST_ID.

    The result is the HVR Integrate, through the use of the stored procedures, will generate a type 2 SCD row for each change with additional columns to set the valid from and to date based on the capture timestamp values, as well as updating the previous row with the new valid _to values.

    All changes are converted to inserts using the /TimeKey parameter of the Integrate action.

     

    Channel            Group               Table                Location                       Action

     

    o2oslow       SRC           *             *             Capture /NoBeforeUpdate

    o2oslow        SRC            *              *             TableProperties /Schema=testdb1

    o2oslow        TGT            *             *             ColumnProperties /Name=valid_from /Extra /IntegrateExpression={hvr_cap_tstamp} /TimeKey /Datatype=timestamp /Precision=6

    o2oslow       TGT           *             *             ColumnProperties /Name=valid_to /Extra /Datatype=timestamp /Precision=6 /Nullable

    o2oslow       TGT           *             *             ColumnProperties /Name=seq_no /Extra /IntegrateExpression=”(select NVL(max(seq_no),0)+1 from dm01_order where prod_id={prod_id} and ord_id={ord_id})” /Key /Datatype=integer /Length=3

    o2oslow       TGT           *             *             DbObjectGeneration /IncludeSqlFile=”C:/Users/hvruser/Downloads/include_sql.txt”

    o2oslow       TGT           *             *             Integrate /DbProc

     

    Sample contents of IncludeSqlFile named include_sql.txt:

     

    #if defined _INCLUDING_INTEG_DBPROC_END  &&  defined _TBL_NAME_DM01_ORDER && _HVR_OP_VAL == 1

    update “DM01_ORDER” set

    “VALID_TO” = “HVR_CAP_TSTAMP$”

    where “PROD_ID” = “PROD_ID$”

    and “ORD_ID” = “ORD_ID$”

    and “SEQ_NO” = (select max(seq_no)-1 from dm01_order where prod_id= “PROD_ID$” and ord_id= “ORD_ID$”);

    #endif

     

     

    This piece of code will be added to stored procedure DM01_ORDER__II which will be created by HVR Initialize when using the Integrate action /DbProc. Results of running HVR Initialize generates the stored procedure DM01_ORDER__II on the target database which looks like:

     

    procedure “DM01_ORDER__II” (

    “HVR_CAP_TSTAMP$” date default null,

    “PROD_ID$” number default null,

    “ORD_ID$” number default null,

    “CUST_NAME$” varchar2 default null,

    “CUST_ADDR$” varchar2 default null,

    “HVR_REFR$” number default 0,

    “HVR_CAP_LOC$” varchar2 default ‘ ‘,

    “HVR_CAP_USER$” varchar2 default ‘ ‘

    ) as

    “HVR_MSG_TXT$” varchar2(255);

    begin

    begin

    insert into “DM01_ORDER” (

    “PROD_ID”,

    “ORD_ID”,

    “CUST_NAME”,

    “CUST_ADDR”,

    “SEQ_NO”,

    “VALID_FROM”,

    “VALID_TO”)

    values (

    “PROD_ID$”,

    “ORD_ID$”,

    “CUST_NAME$”,

    “CUST_ADDR$”,

    (select NVL(max(seq_no),0)+1 from dm01_order where prod_id= “PROD_ID$” and ord_id= “ORD_ID$”),

    “HVR_CAP_TSTAMP$”,

    null);

    exception when others then

    if sqlcode = -60 then

    raise;

    else

    “HVR_MSG_TXT$”:= ‘F_JR0195_INTEG_INS:’ ||

    ‘ dbproc=”DM01_ORDER__II”‘ ||

    ‘ sqlerror=’ || sqlerrm(sqlcode) ||

    ‘ rowcount=’ || to_char(sql%rowcount);

    raise_application_error(-20001, “HVR_MSG_TXT$”);

    end if;

    end;

    if sql%rowcount <> 1 then

    “HVR_MSG_TXT$”:= ‘F_JR0195_INTEG_INS:’ ||

    ‘ dbproc=”DM01_ORDER__II”‘ ||

    ‘ sqlerror=’ || sqlerrm(sqlcode) ||

    ‘ rowcount=’ || to_char(sql%rowcount);

    raise_application_error(-20001, “HVR_MSG_TXT$”);

    end if;

    update “DM01_ORDER” set

        “VALID_TO” = “HVR_CAP_TSTAMP$”

        where “PROD_ID” = “PROD_ID$”

        and “ORD_ID” = “ORD_ID$”

        and “SEQ_NO” = (select max(seq_no)-1 from dm01_order where prod_id= “PROD_ID$” and ord_id= “ORD_ID$”);

    end;

     

    The portion above listed in red illustrates the code generated that handles updating the previous version of the row with the values for the valid_to column.

     

    Upon execution of HVR Integrate schedule, every change on the source is not applied, but instead it calls the stored_procedure mentioned in the /DbProc parameter. If you have multiple tables that all require similar logic, then you need to make sure to include that logic in sqlinclude.sql. For example:

     

    #if defined _INCLUDING_INTEG_DBPROC_END  &&  defined _TBL_NAME_DM01_ORDER && _HVR_OP_VAL == 1

    update “DM01_ORDER” set

    “VALID_TO” = “HVR_CAP_TSTAMP$”

    where “PROD_ID” = “PROD_ID$”

    and “ORD_ID” = “ORD_ID$”

    and “SEQ_NO” = (select max(seq_no)-1 from dm01_order where prod_id= “PROD_ID$” and ord_id= “ORD_ID$”);

    #endif

     

    #if defined _INCLUDING_INTEG_DBPROC_END  &&  defined _TBL_NAME_DM01_PRODUCT && _HVR_OP_VAL == 1

    update “DM01_PRODUCT” set

    “VALID_TO” = “HVR_CAP_TSTAMP$”

    where “PROD_ID” = “PROD_ID$”

    and “SEQ_NO” = (select max(seq_no)-1 from dm01_order where prod_id= “PROD_ID$” and ord_id= “ORD_ID$”);

    #endif

     

    Execution of HVR Initialize would have generated two stored procedures; DM01_ORDER__II and DM01_PRODUCT__II

     

    Generating Test Data:

    Here is an example of generating some test data and the results after integrating the changes through the HVR channel.

     

    SQL> insert into testdb1.dm01_order values (1,1,’insert 1′,’insert street 1′);

    SQL> update testdb1.dm01_order set cust_addr = ‘street update 1’ where prod_id = 1;

     

    Result on target now;

    SQL> select seq_no,valid_from,valid_to,cust_addr from dm01_order where prod_id = 1;

     

    SEQ_NO VALID_FROM                                                                  VALID_TO

    ———- ————————————————————————— —————————————————————————

    CUST_ADDR

    —————————————————————————————————-

    1 03-MAY-18 06.04.58.000000 PM                                                03-MAY-18 06.07.17.000000 PM

    insert street 1

     

    2 03-MAY-18 06.07.17.000000 PM

    street update 1

     

     

    Here are two more updates on source:

     

    SQL> update testdb1.dm01_order set cust_addr = ‘street update 2’ where prod_id = 1;

    SQL> update testdb1.dm01_order set cust_addr = ‘street update 3’ where prod_id = 1;

     

    Results on target;

    SEQ_NO VALID_FROM                                                                  VALID_TO

    ———- ————————————————————————— —————————————————————————

    CUST_ADDR

    —————————————————————————————————-

    1 03-MAY-18 06.04.58.000000 PM                                                03-MAY-18 06.07.17.000000 PM

    insert street 1

     

    2 03-MAY-18 06.07.17.000000 PM                                                03-MAY-18 06.10.34.000000 PM

    street update 1

     

    3 03-MAY-18 06.10.34.000000 PM                                                03-MAY-18 06.10.34.000000 PM

    street update 2

     

     

    SEQ_NO VALID_FROM                                                                  VALID_TO

    ———- ————————————————————————— —————————————————————————

    CUST_ADDR

    —————————————————————————————————-

    4 03-MAY-18 06.10.34.000000 PM

    street update 3

     

     

    Additional Reference:

    For an explanation of the various types of slowly changing dimensions, please refer to: https://en.wikipedia.org/wiki/Slowly_changing_dimension.

    For Expert Note in the HVR Forum titled Audit Tables and Slowly Changing Dimensions, please refer to: https://www.hvr-software.com/support-services/customer-resources/forum/topic/audit-tables-and-slowly-changing-dimensions/

Viewing 1 post (of 1 total)
  • The forum ‘Expert Notes’ is closed to new topics and replies.

© 2020 HVR

Test drive Contact us