The Power of One: IBM + Hortonworks drives advanced analytics

IBM® Db2 Big SQL Sandbox trial

IBM and Hortonworks® have partnered to bring you the future of data science. 

Sign up to explore IBM Db2 Big SQL features on the Hortonworks Hadoop platform.

What is Big Data Analytics?

Artificial intelligence (AI), mobile, social, and the Internet of Things (IoT) are driving data complexity and creating new forms and sources of data. Big data analytics is the use of advanced analytic techniques against very large, diverse data sets that include structured, semi-structured, and unstructured data from different sources and in sizes ranging from terabytes to zettabytes.

Big data is a term applied to data sets whose size or type is beyond the ability of traditional relational databases to capture, manage, and process with low latency. It has one or more of the following characteristics: high volume, high velocity, or high variety. Big data comes from sensors, devices, video/audio, networks, log files, transactional applications, web, and social media, much of it generated in real time and at a very large scale.

Analyzing big data allows analysts, researchers, and business users to make better and faster decisions using data that was previously inaccessible or unusable. Using advanced analytics techniques such as text analytics, machine learning, predictive analytics, data mining, statistics, and natural language processing, businesses can analyze previously untapped data sources independently or together with their existing enterprise data to gain new insights.

Use cases

Data Lake Analytics

A data lake is a shared data environment that comprises multiple repositories and capitalizes on big data technologies. It provides data to an organization for a variety of analytics processes.

IBM Data Science Experience

Data Science is an interdisciplinary field that combines machine learning, statistics, advanced analysis, and programming. It is a new form of art that draws out hidden insights and puts data to work in the cognitive era.

Apache™ Hadoop®

IBM has partnered with Hortonworks to provide an enterprise-grade distribution of Apache Hadoop. This highly scalable storage platform was designed to process very large data sets across hundreds to thousands of computing nodes operating in parallel. It provides a cost-effective storage solution for large data volumes with no format requirements.



IBM Db2 Big SQL

Deliver easy data querying across the enterprise with this hybrid SQL engine for Hadoop. Connect to or query disparate data sources such as HDFS, RDBMS, NoSQL databases, object stores, and WebHDFS. Enjoy low latency, support for ad hoc and complex queries, high performance, security, SQL compatibility, and federation capabilities to get the most from your data warehouse and SQL on Hadoop.

IBM Big Replicate

Provides enterprise-class replication for Apache™ Hadoop® and object stores by delivering continuous availability, performance, and guaranteed data consistency. Big data is replicated from the lab to production, from production to disaster recovery sites, or from ground to cloud object stores, governed by the most demanding business and regulatory requirements.

IBM Analytics for Apache™ Spark™

Increase your analytic agility with the power of open source. Process large data volumes at great speed in a hosted, managed, secure environment.


The Data Warehouse Evolved: A Foundation for Analytical Excellence

Explore a best-in-class approach to data management and how companies are prioritizing data technologies to drive growth and efficiency.

Build a better data lake

Learn how a data lake can help your organization capitalize on a broader variety of data and reach smarter, data-driven decisions.

Making Sense of Big Data

A Day in the Life of an Enterprise Architect.

Engage with an expert

Schedule a one-on-one call with an expert to learn about the IBM and Hortonworks relationship and how we can help you extend data science and machine learning across the Apache Hadoop ecosystem.