watsonx.data

Now available, a fit-for-purpose data store built on an open data lakehouse architecture to scale AI workloads, for all your data, anywhere

An open, hybrid and governed data store

IBM® watsonx.data™ enables you to scale artificial intelligence (AI) and analytics with all your data, wherever it resides, through: 

  • Open formats to access all your data through a single point of entry and share a single copy of data across your organization and workloads, without needing to migrate or recatalog.
  • Fit-for-purpose query engines to optimize your data workloads.
  • An integrated vector database to prepare your data for retrieval augmented generation (RAG) and other AI use cases.
  • An embeddable, AI-powered semantic layer to accelerate data access and unlock new data insights, no SQL required.
  • Integration with databases, tools and modern data stacks to maximize your existing data investments.
  • Hybrid deployment options to deploy across any cloud or on-premises environment in minutes.
Exciting IBM watsonx.data updates - now GA!

Major performance updates, new integrations to unlock your existing data investments, and enhanced features to manage and deliver data for AI.


Use cases

Explore the ways to put your data to work. 

Data store for generative AI

Unify, curate and prepare data efficiently for AI models and applications. Integrated vectorized embedding capabilities enable RAG use cases at scale across large sets of your trusted, governed data.

Cost-optimize your data warehouse

Optimize workloads from your data warehouse by choosing the right engine for the right workload, at the right cost.

Modernize your data lake

Extract more value from your data by modernizing ineffective data lakes with warehouse-like performance, security and governance.

Unlock mainframe data for AI

Transactional data represents the current state of the business for many organizations and can provide unique predictive value for AI insights. Unlock the power of your transactional mainframe data for gen AI through integration with IBM Data Gate for watsonx.

Generative AI-powered data insights

Embed a gen AI-powered semantic layer into watsonx.data to find, semantically enrich and understand previously cryptic structured data in natural language through semantic search, and unlock data insights faster. No SQL required.

Benefits

Scale AI workloads for all your data, anywhere.

Access your data across hybrid cloud

Connect to storage and analytics environments in minutes and access all your data through a single point of entry with a shared metadata layer across clouds and on-premises environments.

Use quality data for AI

Unify, curate and prepare data efficiently for AI models and applications of your choice. Empower your AI with your trusted data.

Reduce your data warehouse costs

With workload optimization across multiple query engines and storage tiers, pair the right workload with the right engine and reduce the cost of your data warehouse by up to 50%. [1]


Key features to access and share data across the hybrid cloud.

Fit-for-purpose query engines

Provide fast, reliable and efficient big data processing at scale through multiple engines, such as Presto, Presto C++, and Spark.

Built-in governance, security and automation

Help ensure enterprise compliance and security with built-in unified governance, or connect to existing solutions, including IBM Knowledge Catalog.

Vendor-agnostic data formats

Use vendor-agnostic open formats for analytic data sets, including Apache Iceberg table format and Apache Hive metastore. This enables different engines to access and share data simultaneously.

Cost-effective, simple object storage

Store large volumes of data in low-cost object storage and share it through an open table format built for high-performance analytics.

Integrated vector database

Use your trusted and governed data in RAG and other machine learning and AI use cases. Unify, curate and prepare vectorized embeddings to enhance the relevance and precision of your AI with your data.
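
As a rough illustration of the retrieval step in RAG, the sketch below ranks stored text by cosine similarity to a query embedding and returns the closest matches as context for a model. It is a minimal example in plain Python, not the watsonx.data API: the document names, the three-dimensional vectors and the `top_k` helper are all hypothetical, and a real deployment would use an embedding model and a vector database instead of an in-memory dictionary.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written embeddings. A real embedding model would
# produce vectors with hundreds of dimensions.
documents = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "warranty terms": [0.7, 0.2, 0.3],
}


def top_k(query_vector, store, k=2):
    """Return the names of the k stored vectors most similar to the query."""
    ranked = sorted(
        store,
        key=lambda name: cosine_similarity(query_vector, store[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical embedding of a question such as "How do I get my money back?"
query = [0.8, 0.1, 0.1]
context = top_k(query, documents)
print(context)  # ['refund policy', 'warranty terms']
```

The retrieved entries would then be passed to the model alongside the question, so that the answer is grounded in governed data rather than in the model's training data alone.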

Hybrid cloud deployments

Seamlessly deploy across any cloud or on-premises environment in minutes with workload portability through Red Hat® OpenShift®. Accelerate on-premises deployment and querying through integration with IBM® Storage Fusion HCI.


Combine the performance of data warehouses with the flexibility of data lakes to address the challenges of today’s complex data landscape and scale AI. Optimize workloads from your data warehouse by choosing the right engine for the right workload at the right cost, and modernize your ineffective data lakes with warehouse-like performance, security and governance. 

Optimize workloads with fit-for-purpose query engines

Drive data workload costs down with cost-efficient compute and storage and with fit-for-purpose query engines, such as Presto, Presto C++, Spark, IBM® Db2® and IBM® Netezza®, that dynamically scale up and down.
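
The idea of pairing each workload with the right engine at the right cost can be sketched as a simple selection rule: among the engines that support a workload type, pick the cheapest. The example below is only an illustration of that rule. The engine names, capability flags and cost figures are invented placeholders, not characteristics or prices of any actual product, and `pick_engine` is a hypothetical helper rather than a watsonx.data function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    name: str
    cost_per_hour: float  # placeholder relative cost units, not real prices
    interactive: bool     # suited to low-latency, ad hoc SQL queries
    batch: bool           # suited to long-running transformation jobs


# Hypothetical engine catalog, for illustration only.
ENGINES = (
    Engine("sql_engine", cost_per_hour=2.0, interactive=True, batch=False),
    Engine("batch_engine", cost_per_hour=3.0, interactive=False, batch=True),
    Engine("warehouse_engine", cost_per_hour=5.0, interactive=True, batch=True),
)


def pick_engine(workload_kind, engines=ENGINES):
    """Return the cheapest engine that supports the given workload kind."""
    if workload_kind not in ("interactive", "batch"):
        raise ValueError(f"unknown workload kind: {workload_kind!r}")
    candidates = [e for e in engines if getattr(e, workload_kind)]
    if not candidates:
        raise LookupError(f"no engine supports {workload_kind!r} workloads")
    return min(candidates, key=lambda e: e.cost_per_hour)


print(pick_engine("interactive").name)  # sql_engine
print(pick_engine("batch").name)        # batch_engine
```

In this toy catalog the most capable engine is never chosen, because a cheaper engine covers each workload type on its own. That is the cost argument in miniature: the expensive engine is reserved for work that only it can handle.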

Accelerate generative AI-powered data insights

Embed a semantic layer powered by IBM® watsonx.ai™ foundation models into watsonx.data to find, semantically enrich and understand previously cryptic structured data in natural language through semantic search, and unlock data insights faster.

Share a single copy of data for analytics and AI

Store vast data in vendor-agnostic open formats, such as Parquet, Avro and Apache ORC, using Apache Iceberg table format and shared metadata to share a single copy of data across multiple query engines. Create vectorized embeddings to prepare your data for AI.

Easy-to-use integrated console

Connect to and access a single copy of your data, wherever it resides across a hybrid cloud, and deploy query engines in minutes. Explore and transform data with common SQL.

Partner with us to deliver commercial solutions enhanced with watsonx.data to better address clients’ needs.

Client stories

Clients are using watsonx.data to capitalize on the value of their data across the hybrid cloud and deploy AI workloads at scale.

Cogniware

“With watsonx.data integrated into Argos, our platform has significantly powered up, simplifying and enhancing our customer experience remarkably.”

— Dominik Regner, Sales Manager, Cogniware


“IBM watsonx.data enables next-generation lakehouse architecture for data-driven enterprises. We believe watsonx.data capabilities will help enterprises lower storage costs and optimize compute while helping to ensure seamless data management capabilities across discrete systems to support all data engineering and analytics (AI/ML) needs.”

— Ashish Baghel, CEO and Founder, NuoData and NucleusTeq


“Organizations struggle with data accessibility and performance when developing the next generation of robust AI and ML models. By working with watsonx.data, we accelerate our clients’ connection to their data, whether on-premises or at the edge, so they can gain trusted insights quickly by accessing all their data across their hybrid cloud environments.”

— Chris Cochran, VP Alliances, WANdisco

watsonx.data and AWS
Available now

IBM watsonx.data and AWS are enhancing cloud-based analytics and AI, enabling organizations to accelerate their data modernization strategies. By combining the openness, performance and governance of IBM watsonx.data with the scalability, agility and cost efficiency of the AWS cloud infrastructure, businesses can achieve greater convenience and flexibility.

Take the next step

Get started with a free trial or request a live demo to see how you can put watsonx.data to work today.


[1] Based on a comparison of published 2023 list prices, normalized for VPC hours, of watsonx.data against several major cloud data warehouse vendors. Savings might vary depending on configurations, workloads and vendor.

IBM statements regarding its plans, directions and intent are subject to change or withdrawal without notice at its sole discretion. See Pricing for more detail. Unless otherwise specified under Software pricing, all features, capabilities and potential updates refer exclusively to SaaS. IBM makes no representation that SaaS and software features and capabilities will be the same.