What is a data lake?

Data lakes are next-generation hybrid data management solutions that can meet big data challenges and drive new levels of real-time analytics. Their highly scalable environment can support extremely large data volumes and accept data in its native format from a wide variety of data sources. Data lakes can help break down silos, enabling organizations to gain 360-degree views of information and conduct cross-department, office or regional analytics. They also enable adoption of modern technologies such as artificial intelligence (AI) and the Internet of Things (IoT).

IBM and Hadoop capabilities

IBM and Cloudera, better together

Improve data discovery, testing, and ad hoc and near real-time queries to support predictive and prescriptive analytics for today’s AI. Use a single ecosystem of products and services that benefits from the combined IBM and Cloudera collaboration and investment in the open source community.

Ladder to AI with IBM and Red Hat

Build your enterprise-grade, open AI data and analytics platform, harnessing machine learning and disparate data to make better data-driven decisions. Benefit from industry-leading security and portability across your hybrid and multicloud environment when accessing, storing and exploring data.

Data lake industry use cases

Retail

• Determine what a customer is likely to purchase online and provide recommendations

• Identify a customer’s “path to purchase” to understand buying patterns and conduct micro-targeted marketing

• Predict or proactively identify fraudulent activity from both inside and outside the organization

Banking

• Predict the success or failure of discounts

• Pinpoint the “next product to buy” and promote that product to customers

• Identify which customers are likely to decrease their bank business and employ proactive marketing activities

Hospitality and travel

• Track and predict customer preferences to guide proactive selling

• Improve the customer experience and boost brand loyalty through customization and personalization

• Conduct real-time pricing and analysis

Data lake capabilities

Streamline data preparation and access

Reduce the time and cost spent on data preparation with a data lake that stores data in its original format. Use semi-structured and unstructured data, and give users the real-time, self-service access tools needed to drive AI and IoT.
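
To make the idea of keeping data in its original format more concrete, here is a minimal Python sketch of applying a schema only at read time. It is an illustration under stated assumptions, not a description of any IBM or Cloudera product: the object names, the field names (`order_id`, `amount`) and the `read_orders` helper are all hypothetical, and an in-memory dictionary stands in for lake storage.

```python
import csv
import io
import json

# Hypothetical raw payloads, kept exactly as the (made-up) sources emitted them.
# A dict stands in for lake storage; a real lake would use object or file storage.
RAW_OBJECTS = {
    "pos/sales.csv": "order_id,amount\n1001,19.99\n1002,5.00\n",
    "web/clicks.jsonl": '{"order_id": 1003, "amount": "42.50", "page": "/cart"}\n',
}


def read_orders(objects):
    """Apply a schema at read time: yield (order_id, amount) from each raw object."""
    for name, payload in objects.items():
        if name.endswith(".csv"):
            rows = csv.DictReader(io.StringIO(payload))
        elif name.endswith(".jsonl"):
            rows = (json.loads(line) for line in payload.splitlines() if line.strip())
        else:
            continue  # unknown formats stay in storage untouched
        for row in rows:
            # Types are coerced here, at query time, not when the data was stored.
            yield int(row["order_id"]), float(row["amount"])


orders = list(read_orders(RAW_OBJECTS))
total = sum(amount for _, amount in orders)
```

Nothing is converted when the data lands. Each consumer decides which fields it needs and how to type them, which is where the saving in up-front preparation comes from.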

Reduce IT and warehouse costs

Use commodity hardware when building your data lake to drive unlimited scalability and decrease capital expenditures. Save additional costs when using the data lake as a repository for older data that would otherwise take up capacity in a more expensive data warehouse.

Improve data-driven decisions

Federate and analyze data from more sources for deeper insights and more accurate results. Data lake governance features help ensure data is relevant and trustworthy. Coupled with real-time analytics and AI capabilities, the data lake allows your organization to seize new opportunities as they unfold.

Build a solution that optimizes the potential of your data lake

Data lake resources

Ebook: Build a better data lake

Learn about best practices and potential pitfalls when integrating a data lake in your existing data infrastructure. Understand the importance of enterprise-grade security and governance when using a growing diversity of data.

Infographic: Connect more data from more sources with a data lake

Discover the new types and sources of data that can be used by integrating data lakes into your existing hybrid data management strategy. Data lakes allow you to tap into unstructured data and generate insights from real-time ad hoc queries and analysis.

Blog: Hortonworks/Cloudera merger

The January 2019 merger of Hortonworks and Cloudera is expected to shape the future market for big data and analytics. Read how the continued strategic partnership between Cloudera/Hortonworks and IBM can benefit our mutual customers.

Engage with an expert

Schedule a no-cost, one-on-one call with an experienced IBM expert

Learn about the IBM products, solutions and services available to help you build and grow a successful data lake.