
Introducing GPU Acceleration for IBM watsonx.data: Private Technical Preview

Designed to help enterprises tackle one of their most pressing challenges: the rising cost of analytics and AI at scale.

At NVIDIA GTC, IBM, NVIDIA and Nestlé showcased what’s possible when GPU acceleration meets open data architecture, demonstrating up to 5x faster analytics workloads while reducing costs by as much as 83%. It was a clear signal of what the future of data-intensive workloads can look like when performance and efficiency are rethought together.

Today, IBM is building on that momentum with the official announcement of the private technical preview of GPU-accelerated query processing for watsonx.data—designed to help enterprises tackle one of their most pressing challenges: the rising cost of analytics and AI at scale.

As organizations continue to expand AI and analytics across increasingly complex, distributed data environments, traditional CPU-based systems are struggling to keep pace. The result is slower insights, escalating infrastructure costs and growing pressure on data teams to deliver more value with fewer resources. GPU acceleration represents a new path forward.

Proven impact: Redefining cost and performance at scale

Early results from private technical preview clients demonstrate the real-world potential of this approach. 

These results are not incremental improvements. They signal a fundamental shift in how organizations can approach data-intensive workloads—where performance gains and cost reductions are no longer a tradeoff, but a combined outcome. As data volumes grow and AI adoption accelerates, this kind of step-change improvement is quickly moving from competitive advantage to business necessity.

IBM has also applied this innovation internally through its CIO organization, acting as “Client Zero” for GPU-accelerated watsonx.data. In early deployments running IBM watsonx.data Presto C++ with GPU acceleration, IBM teams have achieved up to 25x faster query performance than CPU-only execution (see footnote 1) while reducing costs by approximately 80% for this workload due to reduced runtime (see footnote 2). These results demonstrate not only the scalability of the approach, but also IBM’s commitment to validating the technology in real-world enterprise environments before bringing it to clients. The results are derived from IBM testing of telemetry query workloads on NVIDIA A100 GPU infrastructure. Results may vary by query, data volume, infrastructure, configuration and conditions.

Rewriting the economics of analytics and AI

At the core of this innovation is a simple but powerful shift: moving compute-intensive query operations, such as joins, aggregations and filtering, from CPUs to GPUs. By leveraging massively parallel processing, analytical workloads can execute drastically faster than with traditional approaches, while significantly reducing infrastructure usage and cost.
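To make the idea of parallel query operators concrete, here is a minimal conceptual sketch in Python. It is not how Presto C++ or the GPU engine in watsonx.data is implemented. The function names, the sample data and the use of a thread pool are all assumptions chosen for illustration. It shows why a filter followed by an aggregation parallelizes well: each chunk of rows can be filtered and reduced independently, and only the small partial results need to be merged.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(chunk, threshold):
    """Filter one chunk and return a partial (sum, count) for the surviving rows."""
    kept = [value for value in chunk if value > threshold]
    return sum(kept), len(kept)


def parallel_filtered_mean(values, threshold, num_chunks=4):
    """Mean of the values above threshold, built from independent per-chunk partials."""
    size = max(1, -(-len(values) // num_chunks))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda c: partial_aggregate(c, threshold), chunks))
    # Merge step: combine the small partial results into the final answer.
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count if count else None


# Hypothetical sample column, standing in for a telemetry metric.
latencies_ms = [12, 250, 31, 480, 95, 610, 7, 330]
print(parallel_filtered_mean(latencies_ms, threshold=100))
```

A GPU applies the same pattern at a far finer grain, with thousands of threads each handling a small slice of a column, which is why operators of this shape are natural candidates for acceleration.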

This is more than a performance boost; it’s a structural change in how analytics economics work. Faster queries mean faster decisions. Lower compute requirements translate directly into reduced cost per query. And with more efficient resource utilization, organizations can scale analytics and AI workloads without linear cost increases.

Enterprise-ready performance without disruption

One of the biggest barriers to modernization has always been complexity. Many performance solutions require rewriting SQL, migrating data or rearchitecting pipelines. With GPU acceleration, that tradeoff disappears.

This capability is fully transparent to users—existing queries, data formats and connectors continue to work as-is. Organizations can immediately benefit from faster query execution, higher concurrency and lower latency without disrupting existing workflows. This makes it uniquely suited for enterprises balancing innovation with operational stability.

Just as importantly, the solution is built for hybrid environments, integrating seamlessly across cloud and on-premises deployments while maintaining open-source Presto compatibility and enterprise-grade governance.

Expanding access to clients

We are now expanding access to this capability through our private technical preview program, giving select clients the opportunity to collaborate directly with IBM product and engineering teams, influence the roadmap and gain early access to next-generation analytics performance.

If your organization is running large-scale analytics, exploring AI-driven workloads or looking to improve price-performance without disruption, we invite you to join us.

Participate in the IBM watsonx.data GPU acceleration private technical preview and help shape the future of enterprise analytics:

Sign up today

Margo Harrell

Product Marketing, IBM watsonx.data

Hemant Suri

Program Director

IBM watsonx.data

Footnotes

1 Based on internal IBM testing on telemetry workloads comparing GPU-accelerated infrastructure (an on-premises system using a single server with 8 NVIDIA A100 GPUs) with CPUs on watsonx.data SaaS with Presto running on IBM Cloud. On the full dataset of approximately 1.159 billion rows, telemetry queries were completed in 1.77 minutes using 4 GPUs, compared to a prior baseline of approximately 45 minutes in the CPU environment. Results are based on tested workloads and configurations. Actual performance will vary depending on query type, data volume, infrastructure, cluster configuration, and operating conditions.
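As a quick check of the arithmetic behind the "up to 25x" figure, the runtimes quoted in this footnote can be turned into a speedup ratio. The sketch below uses only the two runtimes stated above. The function and constant names are illustrative, not part of any IBM tooling.

```python
def speedup(baseline_minutes, accelerated_minutes):
    """Return how many times faster the accelerated run is than the baseline."""
    if accelerated_minutes <= 0:
        raise ValueError("accelerated_minutes must be positive")
    return baseline_minutes / accelerated_minutes


CPU_BASELINE_MINUTES = 45.0  # approximate CPU baseline quoted in the footnote
GPU_RUNTIME_MINUTES = 1.77   # GPU runtime quoted in the footnote

ratio = speedup(CPU_BASELINE_MINUTES, GPU_RUNTIME_MINUTES)
print(f"{ratio:.1f}x faster")  # 45 / 1.77 is roughly 25.4
```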

2 Based on internal IBM testing on telemetry workloads comparing GPU-accelerated infrastructure (an on-premises system using a single server with 8 NVIDIA A100 GPUs) with CPUs on watsonx.data SaaS with Presto running on IBM Cloud. On the full dataset of approximately 1.159 billion rows, telemetry queries were completed in 1.77 minutes using 4 GPUs, compared to a prior baseline of approximately 45 minutes in the CPU environment. CPU costs were based on IBM published instance cost per query using the watsonx.data SaaS instance setup in IBM Cloud. The GPU cost per query for the tested on-premises watsonx.data system was not directly available and was estimated by assuming that GPU hourly cost is 2x CPU hourly cost. Cost calculations use runtime as the primary variable. Based on this methodology, total cost was estimated to decrease to approximately 15–20% of prior cost, implying a possible 80–85% savings. Actual costs will vary depending on pricing, workload profile, infrastructure configuration, utilization, and operating conditions.
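The relationship between "cost as a share of prior cost" and "savings", and the general shape of a runtime-driven cost model, can be sketched as follows. This is a simplified illustration, not IBM's actual cost methodology. The function names are hypothetical, and the model assumes cost is simply hourly rate multiplied by runtime.

```python
def relative_cost(baseline_minutes, accelerated_minutes, rate_multiplier):
    """Accelerated cost as a fraction of baseline cost when cost = hourly rate * runtime."""
    if baseline_minutes <= 0:
        raise ValueError("baseline_minutes must be positive")
    return rate_multiplier * accelerated_minutes / baseline_minutes


def savings(cost_fraction):
    """Convert 'new cost as a fraction of prior cost' into fractional savings."""
    return 1.0 - cost_fraction


# Published estimate: cost falls to roughly 15-20% of prior cost.
for fraction in (0.15, 0.20):
    print(f"{fraction:.0%} of prior cost -> {savings(fraction):.0%} savings")

# Runtime-only model using the stated 2x hourly-rate assumption and quoted runtimes.
print(f"runtime-only model: {relative_cost(45.0, 1.77, 2.0):.1%} of prior cost")
```

With only the figures quoted in this footnote, the runtime-only model yields a smaller cost fraction than the published 15–20% estimate. The full methodology therefore presumably includes inputs that are not itemized here, and the sketch should be read as an illustration of the approach rather than a reproduction of the result.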