Welcome to our Data Lake Resource Page
More organizations are adopting data lakes as part of their architecture because data lakes offer low cost and efficiency in storing large volumes of data. Is your organization looking to implement a data lake? Want to learn more about best practices or trends?
Read on to learn about the technology necessary for your data lake project, considerations for planning it, the benefits of data lakes as described by experts in the field, and much more.
On this page you will find:
- Data Lake Best Practice Guides
- The Five Best Practices For Implementing a Data Lake
- A Data Architect’s Guide to Building a Data Lake for Success
- Webinar: Unlocking Your SAP Data with HVR
- Webinar: Five Key Questions About Data Lakes
- Webinar: Deploying a Data Lake in the Cloud
- Webinar: Three Data Lake Deployments Using Different Technologies
- Webinar: Building and Maintaining a Data Lake for Big Data Analytics
- Data Lake Blog Posts
- Diving into the Data Lake
- HVR 5.2: Data Lake Release
- Data Lake or Data Warehouse
- The Future of Data Lakes Infographic
Have questions? Want to understand how HVR can help you with your data lake project? Visit our Data Lake Solution page or contact us to learn more!
Data Lake Best Practice Guides
Guide: The Five Best Practices For Implementing a Data Lake
Most database administrators and data architects want to learn why and how to implement a Data Lake. A recent survey by Hortonworks found that roughly 50 percent of respondents were actively learning about how to capitalize on the benefits of a Data Lake, while an additional 20 percent of respondents were already involved in Data Lake initiatives.
WHY ALL THE INTEREST? BIG DATA. Data Lakes provide a single repository for storing massive amounts of all types of data (unstructured, semi-structured, and structured) in its native format. They grant access to and insight into all this data without time-consuming preparation.
We created this guide for those who are looking to start a new Data Lake initiative. This guide describes how Data Lakes developed, how they benefit your organization, the technology necessary to build a Data Lake, and best practices for getting started.
Guide: A Data Architect’s Guide to Building a Data Lake for Success
Over the past four years, we’ve been involved in a number of data lake projects. Working with customers on this common use case has taught us a lot about what makes a data lake project successful. We have experience integrating data into the different types of technologies that work well as a data lake and understand the nuances of data integration for each. From that experience, we see three common blueprints for data lakes: file systems, streaming pipelines, and scale-out databases.
In this e-book, we describe each type of data lake, along with its pluses and minuses, to help you determine the best option for your project.
- Learn the three common blueprints for data lakes
- Find out which technology makes sense for your data lake
- e-Book includes a comparison guide to data lake technologies
- Get strategies for successfully integrating data into your data lake
Webinar: Unlocking Your SAP Data with HVR
In this webinar, HVR’s CTO, Mark Van de Wiel, discusses the challenges of extracting data from SAP ECC and SAP HANA. He explains why it’s important to be able to extract and leverage this data continuously and in real time for common use cases such as data lakes, data warehousing, and consolidated reporting.
Webinar: Five Key Questions About Data Lakes
Many organizations currently have a serious interest in data lakes because of the business analytics and new data-driven practices that lakes promise. Yet these organizations still aren’t quite ready to take a dive into a data lake. Whether they are unable to define standard structures, align and maintain business meanings, or create a governance strategy, these companies struggle to anticipate what truly lies beneath the surface of the data lake.
What you will learn:
- What a data lake is and what it isn’t
- What technology platforms, tools, designs, and architectures are involved
- Why you may need a lake, and what business value you can expect to get from it
- How to get started in a safe and value-adding way
- What the critical success factors are
Webinar: Deploying a Data Lake in the Cloud
In this webinar, HVR experts Joe deBuzna and Mark Van de Wiel provide practical, real-world knowledge about how to integrate data successfully into your cloud-based data lake. They address common cloud and data lake deployment concerns such as integrating data from multiple sources, moving data securely, validating data, and more.
This webinar also features a live demo of HVR’s latest Data Lake release continuously moving on-prem data into a Data Lake on S3 and programmatically validating data correctness.
What you will learn:
- Evolution of data integration that led to Data Lakes
- Common challenges when integrating data into a Data Lake and how to overcome them
- Best practices for integration into, out of, and between clouds
Webinar: Three Data Lake Deployment Examples Using Different Technologies
In this webinar, learn about technologies that have served as a data lake for some of the largest organizations in the world. These organizations have used different technologies and strategies for managing their data lake. At HVR, we help these organizations integrate their data from multiple sources into their data lake.
What you will learn:
- Three different technologies used for “data lakes”
- Considerations when deploying a data lake in the cloud
- How to continuously integrate data into your data lake
- How to create a data lake that can be trusted
Webinar: Building and Maintaining a Data Lake for Big Data Analytics
Many data-driven organizations are adopting data lakes to support data discovery, data science, and real-time operational analytics capabilities. In fact, one-third of DBTA readers are planning data lake projects for 2017. The ability to inexpensively store large volumes of data from diverse sources and make that data readily accessible to workers and applications across the enterprise is a huge advantage for companies pursuing new types of analytics, especially those involving the Internet of Things and cognitive computing use cases.
However, building and maintaining a data lake to support new analytics applications involves a number of technical challenges:
- Data architecture
- Data integration
- Data security
- Data governance
In this webinar, your hosts highlight common pitfalls, key best practices, and success stories in building and maintaining a data lake for big data analytics that you can trust.
Data Lake Blog Posts
Diving into the Data Lake
The concept of the enterprise data lake is one of the most talked about ideas in the modern data-warehousing world. (If you’re not already familiar with how data lakes and data warehouses differ, review our data lake vs data warehouse post.) It is also one of the most divisive concepts, with analysts, vendors, and users split on whether this approach is an analytics breakthrough or an enterprise-level garbage bin for data that will never be looked at again.
HVR 5.2: Data Lake Release
In the last 8-12 months, we have seen an increased interest in continuous integration into data lakes implemented on file systems like Hadoop and S3. While early data lakes may have been centered around sensor-generated data, logs, or social media, we now see an increased interest in building solutions on top of data lakes that also use data from traditional relational database applications like ERP systems.
Data Lake or Data Warehouse?
Over the last few decades we’ve seen an explosion in the use of Data Warehousing within the IT industry; it was all the rage for doing analytics and reporting. The concept was popular because companies could perform cross-analysis of their large amounts of data quickly and efficiently to support management’s strategic decision-making process.