25 February 2025
Uniting DataStax’s NoSQL expertise and cutting-edge data and AI technologies with IBM’s watsonx enterprise AI stack to unlock innovation at scale.
Today IBM is announcing its intent to acquire DataStax, helping enterprises harness crucial enterprise data to maximize the value of generative AI at scale.
DataStax is the creator of technologies essential to this mission, including Astra DB and DataStax Enterprise, the NoSQL and vector databases powered by Apache Cassandra, as well as Langflow, the open-source tool and community for low-code AI application development.
Data-driven enterprise applications need to deliver scalable, always-on performance, with flexibility, security, and streamlined developer productivity. Apache Cassandra® is the NoSQL database of choice for many organizations because of these enduring qualities and the continuous innovation in the open-source community. That’s why businesses like FedEx, Capital One, and Verizon are working with DataStax and Apache Cassandra to future-proof themselves with a resilient, distributed architecture capable of handling massive data volumes and high-velocity traffic without compromising performance.
At IBM, we believe in the power of open source and the innovation driven by community collaboration. This is why open-source software has been the cornerstone of our watsonx AI portfolio, which incorporates technologies like Iceberg, Spark, Velox, Presto and more. We share this belief with DataStax, which has continued to expand its stewardship of open source, including Apache Cassandra®, Langflow, OpenSearch, and more.
As a part of our shared commitment to open-source excellence, we look forward to supporting and innovating alongside these communities, helping shape the future of enterprise data management and AI-driven solutions.
Most organizations have plenty of data; some of our clients say they are drowning in it. But that data is frequently unstructured and spread across disparate programs, environments and teams. Unstructured data is a treasure trove of untapped business intelligence and accounted for 93% of all enterprise data in 2024, according to IDC. Harnessing the power of this data within generative AI applications is essential. But to do that, enterprises must first make order out of data chaos.
The strategic acquisition of DataStax brings cutting-edge capabilities in managing unstructured and semi-structured data to watsonx, building on open-source Cassandra investments for enterprise applications and enabling clients to modernize and develop next-generation AI applications. The data infrastructure required for AI is much more than just “vector.” Many modalities of data (JSON, time-series, key/value, tabular, graph) need to come together to make data ingestion and search accurate and relevant. With these modalities built into a simplified and scalable solution (thanks to generative AI), users don't have to stitch together a multitude of data representations to gain value from their enterprise data.
Today's methods for preparing unstructured data for AI remain limited. Traditional RAG relies on a manual process that demands numerous iterations to improve results and still yields limited accuracy and relevancy. In contrast, emerging techniques such as multi-modal RAG (e.g., Graph RAG and SQL RAG) integrate derivative representations of unstructured knowledge, delivering more accurate and relevant search results. By capturing relationships, metadata, hierarchies, and connections that pure semantic embeddings overlook, these approaches enable more enterprise-ready, performant, relevant and efficient solutions.
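To make the contrast concrete, here is a minimal sketch of the general idea behind graph-augmented retrieval. It is an illustration only, not how watsonx, Astra DB, or Langflow implement it: the documents, the three-dimensional embeddings, the relationship edges, and the function names are all hypothetical, and real systems use learned embeddings and a database rather than in-memory dictionaries.

```python
import math

# Hypothetical toy corpus: each document has a tiny hand-made embedding.
DOCS = {
    "policy": {"text": "Refund policy for enterprise contracts", "vec": [0.9, 0.1, 0.0]},
    "contract": {"text": "Acme master services agreement", "vec": [0.1, 0.9, 0.0]},
    "invoice": {"text": "Acme invoice, March", "vec": [0.0, 0.2, 0.9]},
}

# Hypothetical relationship edges: document id -> ids of related documents.
EDGES = {
    "policy": ["contract"],
    "contract": ["invoice"],
    "invoice": [],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query_vec, k=1):
    """Plain semantic retrieval: the k documents most similar to the query."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]["vec"]), reverse=True)
    return ranked[:k]


def graph_expanded_search(query_vec, k=1, hops=1):
    """Semantic retrieval, then follow relationship edges for `hops` steps."""
    hits = vector_search(query_vec, k)
    results = list(hits)
    frontier = list(hits)
    for _ in range(hops):
        next_frontier = []
        for doc_id in frontier:
            for neighbor in EDGES.get(doc_id, []):
                if neighbor not in results:
                    results.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return results


query = [1.0, 0.0, 0.0]  # hypothetical embedding of a question about refunds
print(vector_search(query))          # ['policy']
print(graph_expanded_search(query))  # ['policy', 'contract']
```

In this toy setup the contract is a poor semantic match for the query, so similarity search alone misses it, while following the explicit relationship from the policy brings it into the result set.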
Watsonx has been on a mission to simplify data for AI applications. With DataStax’s Astra DB and DataStax Enterprise, watsonx will leverage the robust capabilities of NoSQL and advanced vector representations. Our combined technology will capture richer, more nuanced representations of knowledge, ultimately leading to more efficient and accurate outcomes. By harnessing DataStax's expertise in managing large-scale, unstructured data and combining it with watsonx's innovative data and AI solutions, we will provide enterprise-ready data for AI with better data performance, search relevancy, and overall operational efficiency.
The essential components of the next-generation AI stack are AI applications powered by middleware, unstructured and semi-structured databases, and models. DataStax is the parent company of Langflow, an open-source project that delivers orchestration for AI applications, complementing its vector store. Bringing these capabilities alongside the strength of our watsonx portfolio gives IBM the complete stack to power the next era of generative AI for business.
Langflow empowers developers to rapidly prototype, build, and deploy RAG and multi-agent AI applications with simplicity. Langflow is Python-based and model-, API-, and database-agnostic. By providing a low-code interface, it simplifies the complex integration of generative AI models, data processing, and AI workflows, enabling developers to focus on creating intelligent generative AI applications rather than managing technical integrations.
Tens of thousands of developers trust Langflow for their AI development needs, and the project has earned over 49,000 stars on GitHub. Paired with Astra DB (the near-zero latency database with powerful vector and knowledge graph capabilities) and IBM’s watsonx.data, Langflow provides one of the fastest routes to building enterprise-ready generative AI applications.
We are immensely excited about the value that our combined technologies can bring to our clients and the opportunity we have to continue advancing open-source excellence and innovation across critical areas in data and AI. We are deeply committed to DataStax customers and to ensuring they continue to have scalable, always-on access to their most critical data workloads while preparing them for AI. We also remain dedicated to the open-source community and to collaborating closely on projects like Apache Cassandra®, Langflow, and OpenSearch to drive continuous innovation and shared success.
Together with DataStax and these communities, we will continue to drive cutting-edge capabilities into our watsonx portfolio—with a focus on helping our clients accelerate the adoption of AI that is open, trusted and built for business.
To hear further perspective from Chet Kapoor, Chairman and CEO of DataStax, on how this move will help our clients unlock the full potential of their data and AI workloads, check out his blog post here: https://www.datastax.com/blog/ibm-plans-to-acquire-datastax
Subject to close of the transaction and regulatory approval