Integrating SAP with Snowflake: Webinar Recap
Managing data flows today and into the future
The unlimited scalability of Snowflake’s cloud data platform and HVR’s unique change data capture technology provide customers with an effective solution for continuous, high-volume data replication and integration.
During the webinar, Achieving a 360 View of your Business—Integrating SAP Data with Snowflake, HVR’s CTO Mark Van de Wiel and Snowflake’s Sales Engineer, John Gontarz, shared the strengths of each solution and explained how, when the two are used together, HVR can efficiently replicate data from SAP ERP into Snowflake for real-time data analytics. In addition, they discussed how their joint customer, Pitney Bowes, is leveraging HVR and Snowflake’s cloud data platform for analytics from SAP and Oracle.
The Power of Snowflake’s Cloud Data Platform
Snowflake’s elastic, cloud-built data platform is a complete SQL database. It’s also a zero-management platform, which means there is no hardware to procure, install, or configure. It’s a platform for all of your data, structured and semi-structured. Snowflake is a comfortable environment to work in; as a result, everyone from database administrators and data engineers to data scientists and end users can go in and run analytics. Snowflake also enables secure, live sharing of data without having to export that data and transmit it via other methods.
Snowflake accepts a variety of data sources, including OLTP databases, enterprise applications, third-party data, web/log data, and IoT. All of this data can go into Snowflake as a central storage or persistence layer. Once the data is available, you can run standard data warehouse workloads—workloads that you would typically run on a traditional on-premises system. Many Snowflake customers store anywhere from gigabytes to petabytes of data, and some have even built their products with Snowflake as the underlying engine.
You don’t always know the value of your data until you start working with it.
SAP applications generate large amounts of very important data. It is next to impossible, as well as cost-prohibitive, to combine all your data in the transactional SAP system where the SAP data is being generated. In many cases, customers move SAP data into Snowflake as is, and process what they need, when they need to use it. All of this is possible because of Snowflake’s unique architecture.
Snowflake is a service available on Google Cloud Platform, Amazon Web Services, and Azure. Regardless of which cloud Snowflake is running in, it’s the same Snowflake. The storage layer scales independently of the other layers, giving it virtually unlimited scalability. The second layer up is the compute layer, which is separate from the storage layer.
Each compute cluster is isolated from every other compute cluster, which brings massive scalability to the platform. This helps when you’re not just working with SAP data but combining it with data from Salesforce, Workday, or other third-party applications. Next is the cloud services layer, which handles security, user and role management, and transactional consistency.
HVR is a great solution for getting data out of SAP systems. HVR can extract data from various systems into a Snowflake data warehouse. After an initial load, HVR performs incremental change data capture from the SAP systems and pushes those changes into Snowflake. As a result, you always have up-to-date data in your Snowflake tables based on what’s happening in the SAP system.
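Conceptually, this capture-then-apply flow can be sketched in a few lines of Python. This is an illustrative toy, not HVR’s actual implementation or API: each change record carries an operation type, a key, and row values, and applying the records in order keeps the target in sync with the source.

```python
# Illustrative change data capture "apply" logic (not HVR's API):
# each change record is (operation, primary key, row values).
def apply_changes(target, changes):
    """Apply an ordered stream of change records to a key->row table."""
    for op, key, row in changes:
        if op == "insert":
            target[key] = row
        elif op == "update":
            target[key].update(row)
        elif op == "delete":
            target.pop(key, None)
    return target

# Initial load, then incremental changes captured from the source's log.
orders = {1: {"status": "open", "qty": 5}}
log = [
    ("insert", 2, {"status": "open", "qty": 3}),
    ("update", 1, {"status": "shipped"}),
    ("delete", 2, {}),
]
apply_changes(orders, log)
# orders now mirrors the source: {1: {"status": "shipped", "qty": 5}}
```

Because only the changed rows travel, the target stays current without re-copying the full table—the core advantage of log-based replication over nightly batch extracts.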
The Power of HVR
HVR moves high volumes of data to and from a variety of sources and targets for real-time reporting and analytics. For customers running massive workloads on their transaction processing databases, they need to make sure that all of their data arrives securely, efficiently, and quickly into the destination database. That’s where HVR comes in.
HVR’s central installation, referred to as the hub, connects to a repository database where all of the setup and configuration happens and which you connect to for log retrieval. For data replication, HVR is an end-to-end solution that starts by discovering table definitions from the source.
Once the table definitions are discovered, HVR creates compatible table definitions in the target database, ensuring that no precision is lost along the way. Then comes the initial load, which is referred to as a refresh. Incremental, log-based change data capture then allows for continuous integration, keeping the data in sync in real time.
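The idea of deriving compatible target definitions can be illustrated with a toy type mapping. The rules below are hypothetical examples, not HVR’s actual mappings: each source column type is translated to a target type that carries the same precision and scale, so numeric values survive the move intact.

```python
# Hypothetical source-to-target column type mapping (illustrative only;
# HVR's real mapping rules are far more extensive).
def map_column(name, src_type, precision=None, scale=None):
    if src_type == "NUMBER":
        # Carry precision and scale through so no digits are lost.
        return f"{name} NUMBER({precision},{scale})"
    if src_type == "VARCHAR2":
        return f"{name} VARCHAR({precision})"
    if src_type == "DATE":
        return f"{name} TIMESTAMP_NTZ"
    raise ValueError(f"unmapped source type: {src_type}")

source_columns = [
    ("ORDER_ID", "NUMBER", 10, 0),
    ("AMOUNT", "NUMBER", 12, 2),     # scale preserved: 12,2 stays 12,2
    ("NOTE", "VARCHAR2", 255, None),
]
ddl = ", ".join(map_column(*col) for col in source_columns)
print(f"CREATE TABLE ORDERS ({ddl})")
```

The point of the sketch is the invariant, not the syntax: whatever the target dialect, the generated definition must be at least as precise as the source definition it was derived from.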
HVR’s distributed architecture spreads the load and creates an environment that is manageable with many sources and/or multiple targets, and that scales despite high volumes. The distributed architecture also provides a unified approach to authentication between systems, including certificate-based authentication as a two-step verification mechanism, and a standardized approach to encrypting data on the wire.
During this ongoing process, customers can feel at ease about their transferred data because, at any time, they can validate whether the data in their source system matches the data in the destination by leveraging HVR’s compare capabilities. Of course, all of that is rounded off with the graphical user interface, monitoring capabilities, and more!
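A compare step of this kind can be sketched as row-level checksumming on both sides. This is a simplified illustration, not HVR’s compare algorithm: if the per-key digests match, the tables are in sync; any differing keys point at rows that need repair.

```python
import hashlib

# Simplified source/target compare: digest each row by key, then diff.
# Illustrative only; HVR's actual compare implementation differs.
def table_digest(table):
    return {
        key: hashlib.sha256(repr(sorted(row.items())).encode()).hexdigest()
        for key, row in table.items()
    }

def compare(source, target):
    """Return the set of keys whose rows differ between source and target."""
    src, tgt = table_digest(source), table_digest(target)
    return {k for k in src.keys() | tgt.keys() if src.get(k) != tgt.get(k)}

source = {1: {"qty": 5}, 2: {"qty": 3}}
target = {1: {"qty": 5}, 2: {"qty": 9}}   # row 2 has drifted
print(compare(source, target))            # {2}
```

Exchanging digests rather than full rows keeps the validation cheap enough to run while replication continues.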
SAP is very central to many organizations’ primary business processes. As a result, upgrades are slow, and organizations often stick with the deployed version of ECC for a very long time, rather than upgrade to S/4HANA.
Seventy-seven percent of the world’s transactional revenue touches an SAP system.
When ECC is not running on SAP HANA, it will often have three different types of tables: transparent tables, cluster and pool tables, and even custom Z objects. HVR integrates with the SAP dictionary to ensure it gets the exact current definition, including any custom Z-columns that were added. Then log-based change data capture is performed, and the data is captured, transferred, and decoded away from the SAP system. In this entire process, there is no ABAP and there are no BAPIs involved, and there is no load on the SAP application servers.
If you’re on SAP today, you may be wondering, “How are we going to manage our data flows today as well as into the future?” With HVR, you can get your data into Snowflake for analytics now, and continue to do so if and when your organization is ready to adopt the next releases of SAP.
Real-World Use Case: Pitney Bowes
Pitney Bowes is a leading commerce provider specializing in shipping software and systems, postal meters, office equipment, and enterprise software solutions for customer and address management. With over 13,000 employees, Pitney Bowes works with a large percentage of Fortune 500 companies.
Having been in business for over 90 years, Pitney Bowes has faced many challenges. From a business and technical perspective, the company needed to move from daily batches to real-time updates and consolidate on-premises data into a cloud-based data warehouse.
We often see organizations using the traditional data warehouse approach, with nightly batch jobs that no longer scale and that put too much load on the source system. In a 24/7 economy, it’s inefficient to hit the source system with one large batch load every day. Organizations also need to be responsive to changes and make decisions in real time. That’s why organizations like Pitney Bowes sought out a solution that could provide real-time updates with continuous data flow and low impact on the source systems.
Pitney Bowes also needed to consolidate multiple on-premises data sources into a cloud-based data warehouse. The sources included SAP ECC running on Oracle, with the full gamut of transparent, cluster, and pool tables, and Oracle Lease and Finance Management (OLFM), also running on Oracle. For their targets, they chose to build a data warehouse in Snowflake on AWS and a data lake in Amazon S3. They took advantage of HVR’s capability to capture changes once and then deliver them to many systems.
In addition, Pitney Bowes deployed HVR in a distributed architecture, with the HVR hub still running on-premises. With this approach, they benefit from only needing to open up the firewall into AWS. They have the flexibility to move data from the sources to the destinations with encryption on the wire, distributing the load so that ongoing change data capture delivers changes incrementally into Snowflake and, through capture-once-deliver-many, into S3 separately as a data lake. Separately, in the case of Snowflake on AWS, HVR also uses S3 for transient staging into Snowflake.
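The “capture once, deliver to many” pattern can be sketched as a single captured change stream fanned out to independent delivery targets. This is an illustrative toy, not HVR’s architecture: one channel applies changes as current table state for the warehouse, while the other appends the raw change events for the data lake.

```python
# Illustrative "capture once, deliver many" fan-out (not HVR's design):
# the change stream is read once, then handed to each delivery channel.
def deliver_to_warehouse(state, change):
    op, key, row = change
    if op == "delete":
        state.pop(key, None)
    else:
        state[key] = row          # warehouse keeps the current row state

def deliver_to_lake(events, change):
    events.append(change)         # data lake keeps the full change history

captured = [("insert", 1, {"qty": 5}), ("update", 1, {"qty": 7})]
warehouse, lake = {}, []
for change in captured:           # captured once, delivered to both targets
    deliver_to_warehouse(warehouse, change)
    deliver_to_lake(lake, change)
# warehouse == {1: {"qty": 7}}; lake retains both change events
```

Reading the source log once and fanning out downstream is what keeps the impact on the production system low, no matter how many destinations are added.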
Pitney Bowes is a great use case, but you can also get hands-on experience with your own cloud-based instance. Take a Test Drive and start replicating data today!