Now available, a fit-for-purpose data store built on an open data lakehouse architecture to scale AI workloads, for all your data, anywhere

Start your free trial Explore the interactive demo
An open, hybrid and governed data store

IBM® watsonx.data™ enables you to scale analytics and AI with all your data, wherever it resides, through:

  • Open formats to access all your data through a single point of entry and share a single copy of data across your organization and workloads, without needing to migrate or recatalog, reducing ETL and data duplication.

  • Integrated vectorized embedding capabilities to prepare your data for retrieval-augmented generation (RAG) or other machine learning and generative AI (gen AI) use cases.

  • Gen AI-powered conversational interface to easily find, augment and visualize data and unlock new data insights—no SQL required (in tech preview).

  • Integration with existing databases, tools and modern data stacks.

  • Hybrid deployment options as fully managed SaaS on IBM Cloud® and AWS or self-managed containerized software on premises.

IBM watsonx.data's integrated vector database

Unify, curate and prepare vectorized embeddings for your gen AI applications at scale across your trusted, governed data

Live demo

Book a meeting with watsonx.data product specialists

Get the ebook: The data store for AI

Use cases

Explore the ways to put your data to work. 

Cost-optimize your data warehouse

Optimize workloads from your data warehouse by choosing the right engine for the right workload, at the right cost. Replace ETL jobs and reduce costs of your data warehouse by up to 50% through workload optimization.

Read more about data warehouse optimization

Watch the webinar

Modernize your data lake

Extract more value from your data by modernizing ineffective data lakes with warehouse-like performance, security and governance.

Read more about data lake modernization

Data store for generative AI

Unify, curate and prepare data efficiently for AI models and applications. Integrated vectorized embedding capabilities enable RAG use cases at scale across large sets of your trusted, governed data.

Read more about preparing and delivering your data for AI

Generative AI-powered data insights

Use generative AI built into watsonx.data to find, augment and visualize data and unlock new insights through a conversational interface, with no SQL required. Make cryptic structured data usable for AI with auto-generated, natural-language semantic metadata that gives users easy self-service access to data (coming soon).

Streamlined data engineering and data virtualization

Reduce data pipelines and simplify data transformation. Query data in one place through data virtualization with Presto, which offers more than 35 connectors to databases from various external vendors.

Real-time analytics and business intelligence

Connect existing data with new data in minutes and unlock new insights without the cost and complexity of duplicating and moving data. Integrate with IBM® Cognos® and other third-party business intelligence and dashboard tools to visualize data and insights.

Benefits

Scale AI workloads for all your data, anywhere.

Download the watsonx.data solution brief

Access your data across hybrid cloud

Connect to storage and analytics environments in minutes and access all your data through a single point of entry with a shared metadata layer across clouds and on-premises environments.

Leverage quality data for AI

Unify, curate and prepare data efficiently for AI models and applications of your choice. Empower your AI with your trusted data.

Reduce your data warehouse cost

With workload optimization across multiple query engines and storage tiers, reduce the cost of your data warehouse by up to 50% and pair the right workload with the right engine.¹


Key features to access and share data across hybrid cloud.

Fit-for-purpose query engines

Provide fast, reliable and efficient big data processing at scale through multiple engines, such as Presto and Spark.

Read the Presto white paper

Get the ebook: Learning and operating Presto

Built-in governance, security and automation

Ensure enterprise compliance and security using built-in unified governance or connect to existing solutions.

Vendor-agnostic data formats

Use vendor-agnostic open formats for analytic data sets, including Apache Iceberg table format and Apache Hive metastore. This enables different engines to access and share data simultaneously.

Cost-effective, simple object storage

Store large volumes of data in low-cost object storage and share it through an open table format built for high-performance analytics.

Integrated vector database

Use your trusted and governed data in RAG and other machine learning and AI use cases. Access, catalog and create vectorized embeddings to empower your AI with your data.
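To make the RAG idea concrete, here is a minimal, self-contained sketch of embedding-based retrieval. It is not watsonx.data code and does not use any IBM API: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and the function names and sample documents are hypothetical.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, documents: list[str], k: int = 2) -> str:
    """Assemble a prompt that grounds the question in retrieved context."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical sample documents, for illustration only.
DOCS = [
    "quarterly revenue by region",
    "employee onboarding checklist",
    "revenue forecast for next quarter",
]

if __name__ == "__main__":
    print(build_prompt("revenue by region", DOCS))
```

In a real deployment the toy `embed` function would be replaced by a trained embedding model, and the sorted scan by a vector index that scales to large document sets.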

Hybrid cloud deployments

Seamlessly deploy across any cloud or on-premises environment in minutes with workload portability through Red Hat® OpenShift®. Accelerate on-premises deployment and querying through integration with IBM® Storage Fusion HCI.

Explore IBM Storage Fusion HCI


Combine the performance of data warehouses with the flexibility of data lakes to address the challenges of today’s complex data landscape and scale AI. Optimize workloads from your data warehouse by choosing the right engine for the right workload at the right cost, and modernize your ineffective data lakes with warehouse-like performance, security and governance. 

Optimize workloads with fit-for-purpose query engines

Drive analytics costs down with cost-efficient compute and storage and fit-for-purpose analytics engines (Presto, Spark, IBM® Db2® and IBM® Netezza®) that dynamically scale up and down.
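As a rough illustration of pairing each workload with the right engine, the sketch below routes workloads to the cheapest engine that meets their latency needs and compares the total with running everything on one engine. The engine names, hourly rates and workload hours are invented for the example; they are not IBM prices or measurements, and the routing rule is an assumption, not how watsonx.data schedules work.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Engine:
    name: str
    cost_per_hour: float  # hypothetical price units, not real pricing
    interactive: bool     # suited to low-latency, ad hoc queries

@dataclass(frozen=True)
class Workload:
    name: str
    hours: float
    needs_interactive: bool

def cheapest_engine(workload: Workload, engines: list[Engine]) -> Engine:
    """Pick the lowest-cost engine that satisfies the workload's latency need."""
    candidates = [e for e in engines if e.interactive or not workload.needs_interactive]
    if not candidates:
        raise ValueError(f"no suitable engine for {workload.name}")
    return min(candidates, key=lambda e: e.cost_per_hour)

def routed_cost(workloads: list[Workload], engines: list[Engine]) -> float:
    """Total cost when each workload runs on its cheapest suitable engine."""
    return sum(w.hours * cheapest_engine(w, engines).cost_per_hour for w in workloads)

def single_engine_cost(workloads: list[Workload], engine: Engine) -> float:
    """Total cost when every workload runs on the same engine."""
    return sum(w.hours * engine.cost_per_hour for w in workloads)

# Invented figures, for illustration only.
WAREHOUSE = Engine("warehouse", cost_per_hour=10.0, interactive=True)
BATCH = Engine("batch", cost_per_hour=4.0, interactive=False)
WORKLOADS = [
    Workload("dashboard queries", hours=2.0, needs_interactive=True),
    Workload("nightly transformation", hours=6.0, needs_interactive=False),
]

if __name__ == "__main__":
    before = single_engine_cost(WORKLOADS, WAREHOUSE)
    after = routed_cost(WORKLOADS, [WAREHOUSE, BATCH])
    print(f"single engine: {before}, routed: {after}")
```

Actual savings depend on real prices, configurations and workload mix, so the numbers here show only the shape of the calculation.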

Insights powered by generative AI (coming soon)

Watsonx.data leverages IBM® watsonx.ai™ foundation models to simplify and accelerate the way users interact with data. You can use natural language to explore, augment and enrich data from a conversational user interface.

Explore foundation models in watsonx.ai

Share a single copy of data for analytics and AI

Store vast data in vendor-agnostic open formats, such as Parquet, Avro and Apache ORC, leveraging Apache Iceberg table format and shared metadata to share a single copy of data across multiple query engines. Create vectorized embeddings to prepare your data for AI.

Easy-to-use integrated console

Connect to your existing analytics data across hybrid cloud and deploy query engines in minutes. Explore and transform data with common SQL.

Partner with us to deliver commercial solutions enhanced with watsonx.data to better address clients’ needs.

Client stories

Clients are using watsonx.data to capitalize on the value of their data across the hybrid cloud and deploy AI workloads at scale.

Explore more client stories

Cogniware

“With watsonx.data integrated into Argos, our platform has significantly powered up, simplifying and enhancing our customer experience remarkably.”

— Dominik Regner, Sales Manager, Cogniware

Read the case study

“IBM watsonx.data enables next-generation lakehouse architecture for data-driven enterprises. We believe watsonx.data capabilities will help enterprises lower storage costs and optimize compute while ensuring seamless data management capabilities across discrete systems to support all data engineering and analytics (AI/ML) needs.”

— Ashish Baghel, CEO and Founder, NuoData and NucleusTeq


“Organizations struggle with data accessibility and performance when developing the next generation of robust AI and ML models. By working with watsonx.data, we accelerate our clients’ connection to their data, whether on premises or at the edge, so they can gain trusted insights quickly by accessing all their data across their hybrid cloud environments.”

— Chris Cochran, VP Alliances, WANdisco

watsonx.data and AWS
Available now

IBM watsonx.data and AWS are enhancing cloud-based analytics and AI, enabling organizations to accelerate their data modernization strategies. By combining the openness, performance and governance of IBM watsonx.data with the scalability, agility and cost efficiency of the AWS cloud infrastructure, businesses can achieve greater convenience and flexibility.

Read the blog
Take the next step

Get started with a free trial or request a live demo to see how you can put watsonx.data to work today.


¹ Based on a comparison of published 2023 list prices, normalized for VPC hours, of watsonx.data with those of several major cloud data warehouse vendors. Savings may vary depending on configurations, workloads and vendor.

IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion. See Pricing for more detail. Unless otherwise specified under Software pricing, all features, capabilities, and potential updates refer exclusively to SaaS. IBM makes no representation that SaaS and software features and capabilities will be the same.