Now available: a fit-for-purpose data store built on an open data lakehouse architecture to scale AI workloads, for all your data, anywhere
An open, hybrid and governed data store 

Watsonx.data makes it possible for enterprises to scale analytics and AI with a fit-for-purpose data store, built on an open lakehouse architecture, supported by querying, governance and open data formats to access and share data. With watsonx.data, you can connect to data in minutes, quickly get trusted insights and reduce your data warehouse costs. It is now available as a service on IBM Cloud and AWS, and as containerized software.

IBM acquires Manta to complement data and AI governance capabilities
Digital event

Join watsonx Day on December 6 for the latest watsonx updates

Get the ebook: The data store for AI
Solution brief: Scale AI workloads for all your data, anywhere

Access all your data across hybrid-cloud

Access all data through a single point of entry with a shared metadata layer across clouds and on-premises environments.

Get started in minutes 

Connect to storage and analytics environments in minutes and enhance trust in data with built-in governance, security and automation.

Reduce the cost of your data warehouse by up to 50%¹ through workload optimization

Optimize costly data warehouse workloads across multiple query engines and storage tiers, pairing the right workload with the right engine.

Key features to access and share data across hybrid cloud
Fit-for-purpose query engines

Provide fast, reliable and efficient processing of big data at scale through multiple engines, such as Presto and Spark.

Read the Presto white paper

Built-in governance, security and automation

Ensure enterprise compliance and security using built-in unified governance or connect to existing solutions.

Vendor-agnostic data formats

Use vendor-agnostic open formats for analytic data sets, including Apache Iceberg table format and Apache Hive metastore, allowing different engines to access and share the same data at the same time.

Cost-effective, simple object storage

Store large volumes of data in low-cost object storage and share it through an open table format built for high-performance analytics.

Hybrid cloud deployments

Deploy seamlessly across any cloud or on-premises environment in minutes, with workload portability through Red Hat OpenShift. Accelerate on-premises deployment and querying through integration with IBM Fusion HCI.

Explore IBM Storage Fusion

Built on an open data lakehouse architecture

Combine the performance of data warehouses with the flexibility of data lakes to address the challenges of today’s complex data landscape and scale AI. Optimize data warehouse workloads by choosing the right engine for the right workload at the right cost, and modernize ineffective data lakes with warehouse-like performance, security and governance.

Read more about data warehouse optimization

Read more about data lake modernization

Optimize workloads with fit-for-purpose query engines

Drive analytics costs down with cost-efficient compute and storage and fit-for-purpose analytics engines (Presto, Spark, Db2 and Netezza) that dynamically scale up and down.
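As a rough illustration of pairing a workload with an engine, the Python sketch below routes jobs with a deliberately simple heuristic. The `Workload` fields, the `choose_engine` function and the routing rules are all hypothetical assumptions made for this example and are not part of watsonx.data. A real decision would also weigh cost, data volume and service levels.

```python
from dataclasses import dataclass


# Hypothetical workload descriptor. These fields are illustrative
# assumptions, not part of any watsonx.data API.
@dataclass(frozen=True)
class Workload:
    name: str
    interactive: bool      # latency-sensitive, ad hoc SQL
    needs_ml_or_etl: bool  # large batch transforms or model training


def choose_engine(w: Workload) -> str:
    """Pick an engine label using a simple, assumed heuristic."""
    if w.needs_ml_or_etl:
        return "spark"      # batch transformation and ML pipelines
    if w.interactive:
        return "presto"     # interactive SQL over lakehouse tables
    return "warehouse"      # e.g. Db2 or Netezza for remaining workloads


jobs = [
    Workload("dashboard_query", interactive=True, needs_ml_or_etl=False),
    Workload("nightly_feature_build", interactive=False, needs_ml_or_etl=True),
    Workload("regulatory_report", interactive=False, needs_ml_or_etl=False),
]

# Map each job name to the engine the heuristic selects.
plan = {job.name: choose_engine(job) for job in jobs}
print(plan)
```

Keeping the rule in a single function means the heuristic can be replaced, for example with one driven by measured cost per query, without touching the code that submits jobs.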

Insights powered by generative AI (coming soon)

Watsonx.data uses watsonx.ai foundation models to simplify and accelerate the way users interact with data. Use natural language to explore, augment and enrich data from a conversational user interface.

Share a single copy of data for analytics and AI

Store vast amounts of data in vendor-agnostic open formats, such as Parquet, Avro and Apache ORC, while using Apache Iceberg table format and shared metadata to share a single copy of data across multiple query engines.
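The idea of sharing one copy of data through shared metadata can be sketched as a toy model in Python. Everything here is a hypothetical stand-in: `SharedCatalog`, `TableEntry`, `Engine` and the `s3://example-bucket/...` paths are invented for illustration and do not reflect the watsonx.data or Apache Iceberg APIs. Real Iceberg metadata also tracks snapshots and manifest files, which this sketch omits.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TableEntry:
    """Toy table record: a name, a file format and the files holding the data."""
    name: str
    file_format: str
    data_files: List[str] = field(default_factory=list)


class SharedCatalog:
    """Hypothetical shared metadata layer that every engine consults."""

    def __init__(self) -> None:
        self._tables: Dict[str, TableEntry] = {}

    def register(self, entry: TableEntry) -> None:
        self._tables[entry.name] = entry

    def add_file(self, table: str, path: str) -> None:
        self._tables[table].data_files.append(path)

    def files_for(self, table: str) -> List[str]:
        # Return a copy so callers cannot mutate catalog state by accident.
        return list(self._tables[table].data_files)


class Engine:
    """Stand-in for a query engine that plans scans from catalog metadata only."""

    def __init__(self, label: str, catalog: SharedCatalog) -> None:
        self.label = label
        self.catalog = catalog

    def plan_scan(self, table: str) -> List[str]:
        return self.catalog.files_for(table)


catalog = SharedCatalog()
catalog.register(
    TableEntry("sales", "parquet", ["s3://example-bucket/sales/part-0001.parquet"])
)

engine_a = Engine("presto", catalog)
engine_b = Engine("spark", catalog)

# A file committed through the catalog becomes visible to both engines,
# and neither engine keeps its own copy of the data.
catalog.add_file("sales", "s3://example-bucket/sales/part-0002.parquet")
print(engine_a.plan_scan("sales") == engine_b.plan_scan("sales"))
```

Because both engines resolve the table through the same catalog, there is no second copy to keep in sync, which is the property the paragraph above describes.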

Easy-to-use integrated console

Connect to your existing analytics data across hybrid-cloud and deploy query engines in minutes. Explore and transform data using common SQL.

Ways to put your data to work

AI and machine learning at scale

Build, train, tune, deploy and monitor trusted AI models for mission-critical workloads with data in the lakehouse, and ensure compliance with lineage and reproducibility of data used for AI.

Real-time analytics and business intelligence

Connect existing data with new data in minutes and unlock new insights without the cost and complexity of duplicating and moving data. Integrate with IBM Cognos and other third-party business intelligence and dashboarding tools to visualize data and insights.

Streamlined data engineering

Reduce data pipelines, simplify data transformation and enrich data for consumption using SQL, Python or an AI-infused conversational interface.

Responsible data sharing

Support self-service access for more users to more data, while enabling security and compliance with centralized governance and local automated policy enforcement through integration with IBM Knowledge Catalog.

Partner with us to deliver enhanced commercial solutions, embedded with watsonx.data, to better address clients’ needs.
Client stories

Clients are using watsonx.data to capitalize on the value of their data across the hybrid-cloud and deploy AI workloads at scale.

Dot Group

“We’re excited to see how watsonx.data can easily bring together unstructured and structured data with trusted data access.”

— Simon Parkinson, Managing Director, Dot Group


“IBM’s watsonx.data enables next-generation lakehouse architecture for data-driven enterprises. We believe watsonx.data capabilities will help enterprises to lower storage costs and optimize the compute while ensuring seamless data management capabilities across discrete systems to support all data engineering and analytics (AI/ML) needs.”

— Ashish Baghel, CEO and Founder, NuoData and NucleusTeq


“Organizations struggle with data accessibility and performance when developing the next generation of robust AI and ML models. By working with watsonx.data, we accelerate our clients’ connection to their data - whether on premises or at the edge - so they can gain trusted insights quickly by accessing all their data across their hybrid-cloud environments.”

— Chris Cochran, VP Alliances, WANdisco

“Watsonx.data could allow us to easily access and analyze our expansive, distributed data ... and maximize our resource utilization to deliver superior user experiences ...”

Vitaly Tsivin, EVP Business Intelligence, AMC Networks

Better together: IBM watsonx.data + AWS
Available now

IBM watsonx.data and Amazon Web Services (AWS) are bringing even greater convenience and flexibility to analytics and AI in the cloud. Organizations can accelerate their data modernization strategy in the cloud by combining the openness, performance and governance of IBM watsonx.data for data, analytics and AI workloads with the scale, agility and cost efficiency of the AWS cloud infrastructure.

Explore other watsonx components and related products

watsonx.ai
Explore our new studio for foundation models, generative AI and machine learning.
Accelerate responsible, transparent and explainable AI workflows.
Db2 Warehouse
Run always-on analytics workloads across the enterprise, now with watsonx.data integration.
Built for unified, high-speed analytics, anywhere, now with watsonx.data integration.
Learn more about our AI and data platform called watsonx. 
Take the next step

Get started with a free trial or request a live demo to see how you can put watsonx.data to work today.


¹ When comparing published 2023 list prices normalized for VPC hours of watsonx.data to several major cloud data warehouse vendors. Savings may vary depending on configurations, workloads and vendor.

IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. See Pricing for more detail. Unless otherwise specified under Software pricing, all features, capabilities, and potential updates refer exclusively to SaaS. IBM makes no representation that SaaS and software features and capabilities will be the same.