IBM today announced the coming launch of IBM watsonx.data, a data store built on an open lakehouse architecture, to help enterprises easily unify and govern their structured and unstructured data, wherever it resides, for high-performance AI and analytics. The solution is currently in a closed beta phase and is expected to be generally available in July 2023.

What is watsonx.data?

Watsonx.data will be core to IBM’s coming AI and data platform, IBM watsonx, announced today at IBM Think. With watsonx, IBM will launch a centralized AI development studio that gives businesses access to proprietary IBM and open-source foundation models, watsonx.data to gather and clean their data, and a toolkit for governance of AI.

Watsonx.data will allow users to access their data through a single point of entry and run multiple fit-for-purpose query engines across IT environments. Through workload optimization, an organization can reduce data warehouse costs by up to 50 percent by augmenting with this solution.[1] It also offers built-in governance, automation and integrations with an organization’s existing databases and tools to simplify setup and user experience.

Supporting the data management life cycle

According to IDC’s Global StorageSphere, enterprise data stored in data centers will grow at a compound annual growth rate of 30% from 2021 to 2026.[2] With increased data volumes come more data silos, higher operational costs, and greater regulatory pressure, which can lead to closer scrutiny of, and demand for improved business outcomes from, data, analytics and AI investments.
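To put the growth rate in perspective, the short sketch below compounds a 30% annual rate over the forecast window. It assumes five compounding periods (2021 to 2026); the function name is illustrative and not taken from IDC or IBM material.

```python
def compound_growth_factor(annual_rate: float, years: int) -> float:
    """Return the total multiplier after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


# Assumption: 2021 -> 2026 is treated as five annual compounding periods.
factor = compound_growth_factor(0.30, 5)
print(f"{factor:.2f}x")  # prints "3.71x"
```

Under that assumption, data stored at the end of the period would be roughly 3.7 times the starting volume.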

This proliferation of data spans every industry, and organizations have an opportunity to turn it into actionable insights that can inform revenue strategies and enhance operational efficiencies.

“The media and entertainment industry has undergone a significant digital transformation, with viewers consuming content across different devices and platforms,” said Vitaly Tsivin, EVP Business Intelligence at AMC Networks. “Watsonx.data could allow us to easily access and analyze our expansive, distributed data to help extract actionable insights and maximize our resource utilization to deliver superior user experiences for viewers of AMC Networks’ curated, high-quality content.”

Notably, watsonx.data runs both on-premises and across multicloud environments. The solution will help businesses harness their increasingly siloed data and apply advanced AI and analytics to derive actionable insights, all while supporting robust data governance and observability throughout the data management life cycle.

Strong partnerships for even stronger solutions

Watsonx.data is engineered to use Intel’s built-in accelerators on Intel’s new 4th Gen Xeon Scalable Processors and open-source query engines such as Presto, the Velox acceleration library and Spark, to deliver rapid and reliable data processing for high-performance SQL querying, reporting, business intelligence, and machine learning.

“We recognize the importance of watsonx.data and the development of the open-source components that it’s built upon,” said Das Kamhout, VP and Senior Principal Engineer of the Cloud and Enterprise Solutions Group at Intel. “We look forward to partnering with IBM to optimize the watsonx.data stack, achieving breakthrough performance through our joint technological contributions to the Presto open-source community.”

IBM and Intel have a long history of collaboration on data and AI products, including the optimization of IBM Db2 on Intel Xeon platforms, AI acceleration with IBM Watson NLP Library for Embed with oneAPI, and now watsonx.data.

Watsonx.data will allow users to modernize their data repositories with data warehouse-like capabilities, while benefiting from low-cost object storage and open data and table formats like Iceberg, to help them make data-driven decisions.

“Open data lakehouse architectures powered by the Apache Iceberg table format give organizations the flexibility to use fit-for-purpose analytical solutions to future-proof their data platforms for all workloads,” said Paul Codding, EVP of Product Management of Cloudera. “IBM and Cloudera customers will benefit from a truly open and interoperable hybrid data platform that fuels and accelerates the adoption of AI across an ever-increasing range of use cases and business processes.”

IBM and Cloudera have a long-standing strategic partnership that includes certified product integrations and joint sales and support models.

Watsonx.data will be available on premises and across multiple cloud providers, including IBM Cloud and Amazon Web Services (AWS). This builds on last year’s announcement that IBM is expanding its relationship with AWS to offer IBM software as a service on AWS. The solution will also be available in AWS Marketplace.

“Organizations are increasingly adopting data lakehouse solutions to support their growing data needs, especially as we see an industry-wide shift toward AI solutions,” said Soo Lee, Director Worldwide Strategic Alliances at AWS. “Making watsonx.data available as a service in AWS Marketplace further supports our customers’ increasing needs around hybrid cloud, giving them greater flexibility to run their business processes wherever they are, while providing choice of a wide range of AWS services and IBM cloud native software attuned to their unique requirements.”

The coming launch of watsonx.data will extend IBM’s market leadership in data and AI, most recently demonstrated by its evaluation as a leader in The Forrester Wave: Data Management for Analytics, by integrating with existing IBM solutions like StepZen, IBM Watson Knowledge Catalog, IBM zSystems, IBM Watson Studio, and IBM Cognos Analytics with Watson. These integrations can enable users to implement various industry-leading data catalog, lineage, governance, and observability solutions across their data ecosystems.

Beyond launch, watsonx.data is expected to undergo continuous development, incorporating the latest performance enhancements to the Presto open-source query engine via Velox. Further development of watsonx.data will also incorporate IBM’s Storage Fusion technology to enhance data caching across remote sources, as well as semantic automation capabilities built on IBM Research’s foundation models to automate data discovery, exploration, and enrichment through conversational user experiences.

Statements regarding IBM’s future direction and intent are subject to change or withdrawal without notice and represent goals and objectives only.

[1] When comparing published 2023 list prices normalized for VPC hours of watsonx.data to several major cloud data warehouse vendors. Savings may vary depending on configurations, workloads and vendors.

[2] IDC, Worldwide Global StorageSphere Forecast, 2022–2026: An Installed Base of 7.9ZB of Storage Capacity in 2021 Came at a Cost of $370 Billion — Is It Enough? (IDC Doc #US49051122, May 2022)
