Today, IBM announced the general availability of the evolved IBM watsonx.data, the only hybrid, open data lakehouse for enterprise AI and analytics.
Organizations can now simplify and scale the access, preparation and delivery of unstructured and structured data to power more accurate, relevant gen AI applications, expand self-service analytics and streamline previously complex data access, enrichment and governance.
Enterprise data is the best tool to power accurate, differentiated AI that is relevant to your industry and your clients and drives competitive advantage. However, 90% of enterprise data is unstructured data, which has largely remained inaccessible and underutilized for gen AI.
Now you can access, prepare and deliver your enterprise unstructured data to power 40% more accurate AI than conventional RAG with IBM watsonx.data.[1] Watsonx.data is uniquely:
Now you can scale and automate:
All of this can be done within IBM watsonx.data to unlock enterprise unstructured data for AI and traditional analytics, such as data engineering, BI and ML.
IBM watsonx.data now offers Apache Gluten accelerated Spark as one of its multiple fit-for-purpose query engines, significantly boosting performance for compute-intensive Spark SQL workloads. Apache Gluten, a high-performance library, optimizes Apache Spark SQL workloads by offloading execution to Velox, a native C++ execution engine. This integration delivers faster query processing and enhanced resource efficiency for large-scale data analytics. Now organizations can execute complex analytical tasks with even greater speed and scalability and at lower costs.
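As a rough illustration of what "offloading execution to Velox" involves in open-source Spark, the sketch below collects the Spark properties that the Apache Gluten documentation describes for enabling its plugin with the Velox backend. The property names follow recent open-source Gluten releases. Whether watsonx.data exposes these settings or presets them for you is an assumption, and the helper names `gluten_spark_conf` and `to_submit_args` are hypothetical.

```python
from typing import Dict, List


def gluten_spark_conf(offheap_size: str = "4g") -> Dict[str, str]:
    """Return Spark properties that enable the Gluten plugin (Velox backend).

    Property names follow the open-source Apache Gluten documentation;
    a managed service such as watsonx.data may preset them (assumption).
    """
    return {
        # Load Gluten as a Spark plugin so eligible SQL operators run natively.
        "spark.plugins": "org.apache.gluten.GlutenPlugin",
        # Native execution allocates off-heap memory, so enable and size it.
        "spark.memory.offHeap.enabled": "true",
        "spark.memory.offHeap.size": offheap_size,
        # Columnar shuffle keeps data in columnar form between stages.
        "spark.shuffle.manager": "org.apache.spark.shuffle.sort.ColumnarShuffleManager",
    }


def to_submit_args(conf: Dict[str, str]) -> List[str]:
    """Render properties as spark-submit arguments: --conf key=value."""
    args: List[str] = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Print the arguments for an 8 GB off-heap allocation (illustrative size).
    print(" ".join(to_submit_args(gluten_spark_conf("8g"))))
```

The off-heap size is workload dependent, so the values shown are placeholders, not recommendations.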
IBM recently acquired DataStax, bringing a NoSQL operational vector datastore, built on Apache Cassandra, to watsonx.data. This addition to watsonx.data enhances our vector capabilities and strengthens our retrieval-augmented generation and knowledge embedding capabilities.
DataStax is optimized for read and write gen AI applications and operational workloads that demand real-time performance, high availability and scale, bringing organizations the speed, reliability and multi-modal support needed for modern AI applications.
DataStax also connects seamlessly with Langflow, soon to be available as part of IBM watsonx.ai. Langflow, an open-source tool with over 60,000 GitHub stars, enables developers to prototype, build and deploy retrieval-augmented generation and multi-agent AI applications through an intuitive low-code interface, reducing development friction and accelerating time to value.
We announced the closed preview of these capabilities at Think 2025, sharing the stage in the Data keynote session, spotlight sessions and techbyte demos with distinguished guest speakers who are paving the way for data and AI innovation in their industries.
Lockheed Martin joined the keynote stage with Meta. Lockheed recently leveraged the transformed watsonx.data, enabling 70,000 engineers, scientists and technicians to retrieve answers and information from millions of documents using natural language. “We are rapidly accelerating our innovation and efficiency, to get solutions out of the lab and into the field, helping create a safer, more secure world,” says John Clark, senior vice president of Technology and Strategic Innovation at Lockheed.
EY recently debuted groundbreaking AI-powered Global Tax Compliance Solutions, built with watsonx, that address the largest challenges facing tax departments. “EY delivers tax services in over 150 countries, and almost universally in those countries, our clients struggle with data,” says Christopher Aiken, Americas Indirect Tax AI Leader at EY. “watsonx has cut down our human effort for data cleansing, enrichment and quality review by 30–50%.”
USAA is leveraging gen AI to drive the future of insurance and improve customer experience. “In the insurance industry, we deal with a significant amount of unstructured data,” says Ramnik Bajaj, Chief Data Analytics & AI Officer at USAA. “For instance, home inspection reports, police reports and accident images contain very little structured data. With gen AI, we have the opportunity to extract key attributes and insights from this unstructured data, making it much more accessible and useful for underwriters, adjusters and service representatives.”
You can now get started with the evolution of watsonx.data as part of the premium edition.
Start a free trial with USD 2,000 in credits
[1] Based on internal testing comparing the answer correctness of AI model outputs using the watsonx.data Premium Edition retrieval layer to vector-only RAG on three common use cases with IBM proprietary datasets, using the same set of selected open-source commodity inferencing, judging and embedding models and additional variables. Results can vary.