Real-Time Reporting & Replication on SAP
Occasionally I see my wife running SAP reports. She needs the results for budget and planning purposes. I don’t know what reports she runs to get the information she needs, but I do know the SAP reports are very slow. Every single one of them. It appears to me that five minutes is the very minimum amount of time it takes to run any report.
As someone who works in IT with many different databases, I find five minutes to pull up a report shockingly slow. What are you going to do during the five minutes you are waiting for the results you need? As an organization, consider the cost of employees waiting for results! Over time this cost may be even higher than the cost of the SAP implementation, which likely wasn’t low to begin with (and took a long time)…
SAP HANA for Faster Reporting
SAP has a strategy to address slow reports: SAP HANA®. SAP HANA is an in-memory database that makes use of modern CPU features to perform fast data processing. Unfortunately, your organization will have to pay dearly (again) to adopt SAP HANA: for hardware to run the SAP database in memory, and even more so for software. Not every organization is ready to do this. Although SAP HANA has been on the market for several years, its stability and maturity are still not quite up to the level of established databases such as Oracle, DB2, and SQL Server.
Given the performance and cost challenges, real-time (and fast) reporting out of SAP systems is a common use case that one would think every SAP client is interested in. SAP has, unfortunately, made this use case harder than we would like it to be for real-time replication technology. There are three different kinds of tables in an SAP system:
- Transparent tables, with a one-to-one mapping between the table definition in SAP and the database table.
- Pool tables that include data from multiple logical tables in SAP into a single database table, storing the bulk of the table data in a compressed and encoded format, unusable for regular database tools unless the data is decoded first.
- Cluster tables that combine data from a few logical tables in SAP into a single database table, also using compression and encoding for the bulk of the data.
The Future of SAP’s Reporting System
SAP has traditionally certified only ABAP code (written in SAP’s proprietary programming language) for extracting data from SAP tables, and the only way to extract with ABAP is in bulk. Bulk extracts put a significant load on the database, and ABAP processing puts a load on the SAP application servers. Bulk extracts also don’t allow for near real-time data integration.
To get closer to real-time replication, SAP introduced its SAP Landscape Transformation technology (SLT). SLT adds triggers to the database tables and stores changes in separate change tables; the capture process then joins each change table with its base table to get the current data values. Although SLT can provide near real-time replication, there are still challenges:
- Triggers slow down database transactions, because in addition to processing the database change itself, the transaction must now also record its changes.
- Data replication is not transactional because changes are pulled at different points in time, and parallelism may be required to achieve low latency.
- The overall overhead of low-latency replication is high, especially on (critical) high-volume SAP systems.
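The trigger-based pattern described above can be sketched in a few lines. This is only an illustration of the general technique, using SQLite from Python's standard library; it is not SLT's actual implementation, and the table names `orders` and `orders_changes` and the function `capture` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);

-- Hypothetical change table: stores only the key and the kind of change.
CREATE TABLE orders_changes (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    id  INTEGER,
    op  TEXT
);

-- Every insert, update, and delete does extra work inside the transaction.
CREATE TRIGGER orders_ins AFTER INSERT ON orders
BEGIN
    INSERT INTO orders_changes (id, op) VALUES (NEW.id, 'I');
END;
CREATE TRIGGER orders_upd AFTER UPDATE ON orders
BEGIN
    INSERT INTO orders_changes (id, op) VALUES (NEW.id, 'U');
END;
CREATE TRIGGER orders_del AFTER DELETE ON orders
BEGIN
    INSERT INTO orders_changes (id, op) VALUES (OLD.id, 'D');
END;
""")


def capture(conn, last_seq):
    """Return changes after last_seq, joined to the base table for current values."""
    return conn.execute(
        """
        SELECT c.seq, c.op, c.id, o.amount
        FROM orders_changes AS c
        LEFT JOIN orders AS o ON o.id = c.id
        WHERE c.seq > ?
        ORDER BY c.seq
        """,
        (last_seq,),
    ).fetchall()


conn.execute("INSERT INTO orders VALUES (1, 10.0)")
conn.execute("UPDATE orders SET amount = 12.5 WHERE id = 1")
conn.execute("INSERT INTO orders VALUES (2, 99.0)")
conn.execute("DELETE FROM orders WHERE id = 2")
conn.commit()

changes = capture(conn, 0)
for row in changes:
    print(row)
```

Because the capture query joins to the base table at read time, the first change for `id = 1` already shows the updated amount, and both changes for `id = 2` come back with no amount because the row is gone. The capture sees current values rather than the values each transaction wrote, which is one way to picture why this style of replication is not transactional.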
HVR supports log-based Change Data Capture (CDC) on multiple transaction processing databases, including databases running SAP as well as SAP HANA. Log-based CDC does not slow down database transactions and keeps track of transaction boundaries to replicate data in near real-time, maintaining transactional consistency.
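To make the idea of keeping track of transaction boundaries concrete, here is a minimal sketch in Python. It assumes a heavily simplified, hypothetical log format (a list of `(transaction id, operation, payload)` tuples); real database transaction logs are binary and vendor-specific, and this is not how HVR reads them. The names `LOG`, `replay`, and `apply_changes` are invented for the example.

```python
from collections import defaultdict

# Hypothetical, simplified log records: (transaction id, operation, payload).
LOG = [
    ("tx1", "insert", {"id": 1, "amount": 10.0}),
    ("tx2", "insert", {"id": 2, "amount": 99.0}),
    ("tx1", "update", {"id": 1, "amount": 12.5}),
    ("tx2", "rollback", None),
    ("tx1", "commit", None),
]


def replay(log):
    """Yield (transaction id, changes) for each committed transaction, in commit order."""
    pending = defaultdict(list)
    for tx, op, payload in log:
        if op == "commit":
            # Release the whole transaction at once, never a partial one.
            yield tx, pending.pop(tx, [])
        elif op == "rollback":
            # Rolled-back work never reaches the target.
            pending.pop(tx, None)
        else:
            pending[tx].append((op, payload))


def apply_changes(target, changes):
    """Apply one transaction's changes to a dict standing in for the target table."""
    for op, row in changes:
        if op == "delete":
            target.pop(row["id"], None)
        else:
            target[row["id"]] = row["amount"]


target = {}
for tx, changes in replay(LOG):
    apply_changes(target, changes)

print(target)
```

Reading the log is a separate activity from the transactions that wrote it, so the source transactions do no extra work, and buffering changes until the commit record arrives is what keeps the target transactionally consistent.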
HVR also supports the decoding of cluster and pool table data. To minimize the overhead on the SAP application and to avoid extra load on the application servers, the decompression and decoding are performed downstream, away from the SAP database, with no reliance on ABAP or BAPIs. To simplify the creation of a reporting system, data warehouse, or data lake on your SAP data, HVR integrates with the SAP Dictionary to obtain up-to-date definitions for SAP tables, including any custom Z columns.
In addition to target table creation, a one-time (initial) data load, and incremental CDC with continuous integration, HVR also provides data validation between source and target for transparent, cluster, and pool tables.
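As a rough picture of what validating data between source and target involves, the sketch below compares two sets of rows by key and by a per-row digest. It is a generic illustration, not HVR's compare feature; the function names `row_digest` and `compare` and the sample rows are assumptions made for the example, and a real comparison would run against the databases and handle type and encoding differences.

```python
import hashlib


def row_digest(row):
    """Hash a row's values so rows can be compared without shipping every column."""
    canonical = "|".join(repr(value) for value in row)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare(source_rows, target_rows, key_index=0):
    """Report keys missing from the target, extra in the target, or with differing data."""
    src = {row[key_index]: row_digest(row) for row in source_rows}
    tgt = {row[key_index]: row_digest(row) for row in target_rows}
    return {
        "missing_in_target": sorted(src.keys() - tgt.keys()),
        "extra_in_target": sorted(tgt.keys() - src.keys()),
        "different": sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k]),
    }


# Made-up sample data: (key, code, amount).
source = [(1, "A", 10.0), (2, "B", 20.0), (3, "C", 30.0)]
target = [(1, "A", 10.0), (2, "B", 25.0), (4, "D", 40.0)]

report = compare(source, target)
print(report)
```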
Note: SAP also provides SAP Replication Server, formerly known as Sybase Replication Server. SAP Replication Server does support log-based data replication from multiple transaction processing databases, including Oracle, DB2, SQL Server, and Sybase. However, SAP’s main focus for Replication Server since its acquisition of Sybase over a decade ago has been to migrate SAP ERP customers from non-HANA databases to SAP HANA. Replication Server does not support log-based CDC from SAP HANA, nor does it support modern cloud-based targets such as cloud file systems (S3, ADLS, GCS) or technology platforms like Snowflake.
Some organizations prefer an out-of-the-box solution that includes reports and dashboards. For these organizations, there are real-time SAP reporting solutions available from third-party vendors such as HVR’s partners Simplement and Teradata.
An upgrade to an enterprise ERP system is not something that is performed overnight. I think we will see a lot more real-time reporting, data warehouse, and data lake use cases on SAP for many years to come.