Apache Hive

Mine data faster with familiar SQL skills

Apache Hadoop is scalable and cost-effective when processing large data volumes, so it's a popular open-source framework for many organizations.

Unfortunately, enterprises with traditional data warehouses usually rely on Structured Query Language (SQL), which makes the Apache Hadoop ecosystem difficult for users and developers accustomed to extracting data with SQL queries.

With Apache Hive, you can write SQL-like queries in Hive Query Language (HQL, also known as HiveQL) to extract data from Hadoop without learning another language, taking advantage of Hadoop's power to save time and money.

Apply structure to unstructured data

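Apache Hive helps you read, write and manage large datasets in distributed storage using SQL. It projects a table structure onto data that is already in storage, so the structure is applied when the data is read.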
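Hive's approach is often described as schema-on-read: a table definition is applied to files already in storage when they are queried. The Python sketch below illustrates that idea only; it is not Hive code. The `SCHEMA` columns, the sample lines and the `read_with_schema` helper are hypothetical, and returning `None` for missing or unconvertible fields is an assumption that loosely mirrors how a query engine can return NULL for such values.

```python
# Hypothetical schema: (column name, converter) pairs applied only at read time.
SCHEMA = [("user_id", int), ("country", str), ("amount", float)]


def _convert(convert, value):
    """Convert one raw field; return None if it is missing or unparseable."""
    if value is None:
        return None
    try:
        return convert(value)
    except (TypeError, ValueError):
        return None


def read_with_schema(lines, schema=SCHEMA, delimiter=","):
    """Yield one dict per raw line, applying the schema as the line is read."""
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        # Pad short rows so missing trailing fields become None.
        fields += [None] * (len(schema) - len(fields))
        yield {
            name: _convert(convert, value)
            for (name, convert), value in zip(schema, fields)
        }


if __name__ == "__main__":
    raw_lines = ["1,US,9.50\n", "2,DE,oops\n", "3\n"]
    for row in read_with_schema(raw_lines):
        print(row)
```

The raw lines are never rewritten; a different schema could be applied to the same lines later, which is the flexibility schema-on-read is meant to provide.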

Gain insights faster and better

Because Hive is built on top of Hadoop, it offers the same speed, flexibility and scalability that Hadoop is known for.

Use the programming language you know

Through a SQL-like interface, developers write HQL statements that closely resemble standard SQL for data query and analysis, so they don't need to learn a new programming language.
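To give a feel for how close HQL is to standard SQL, the sketch below runs a plain SELECT ... GROUP BY statement. The statement uses only basic syntax that Hive should also accept. SQLite stands in for Hive purely so the example runs without a Hadoop cluster, and the `page_views` table, its columns and its rows are hypothetical.

```python
import sqlite3

# Hypothetical table and columns, invented for illustration only.
# The statement itself uses basic SQL that HQL shares with standard SQL.
QUERY = """
SELECT country, COUNT(*) AS views
FROM page_views
WHERE view_date >= '2024-01-01'
GROUP BY country
ORDER BY views DESC, country
"""


def run_demo():
    """Load a few sample rows into an in-memory database and run QUERY."""
    conn = sqlite3.connect(":memory:")  # stand-in for a Hive connection
    conn.execute("CREATE TABLE page_views (country TEXT, view_date TEXT)")
    conn.executemany(
        "INSERT INTO page_views VALUES (?, ?)",
        [
            ("US", "2024-01-05"),
            ("US", "2024-02-01"),
            ("DE", "2024-01-10"),
            ("DE", "2023-12-31"),
            ("FR", "2024-03-01"),
        ],
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for country, views in run_demo():
        print(country, views)
```

Against a real Hive deployment, the same statement would typically be submitted through a Hive client or driver rather than `sqlite3`.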

Understand details about your data

Get to the bottom of your data by extracting and viewing it in various dimensions (consolidated, drilled down or sliced and diced) with online analytical processing (OLAP) capabilities.
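The three OLAP views named above can be sketched in a few lines of Python. This is an illustration of the concepts, not of Hive itself, where such views are typically expressed with GROUP BY clauses. The `SALES` rows and the `aggregate` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: two dimensions (region, year) and one measure (amount).
SALES = [
    {"region": "EU", "year": 2023, "amount": 100},
    {"region": "EU", "year": 2024, "amount": 150},
    {"region": "US", "year": 2023, "amount": 200},
    {"region": "US", "year": 2024, "amount": 250},
]


def aggregate(rows, dims, where=None):
    """Sum 'amount' grouped by dims, optionally keeping only rows matching where."""
    totals = defaultdict(int)
    for row in rows:
        if where and any(row[key] != value for key, value in where.items()):
            continue  # row falls outside the requested slice
        totals[tuple(row[dim] for dim in dims)] += row["amount"]
    return dict(totals)


if __name__ == "__main__":
    print("consolidated:", aggregate(SALES, []))
    print("drilled down:", aggregate(SALES, ["region"]))
    print("sliced:", aggregate(SALES, ["region"], where={"year": 2024}))
```

Grouping by no dimensions gives the consolidated total, adding a dimension drills down, and fixing a dimension to one value takes a slice.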

Apache solutions from IBM

Drive better, faster analytics with Apache technologies through IBM analytics solutions.

IBM Db2® Big SQL

IBM Db2 Big SQL is a hybrid SQL engine for Apache Hadoop. It can concurrently exploit Hive, HBase and Spark using a single database connection or query.

Explore Db2 Big SQL

IBM Open Data Analytics for z/OS

Optimize open-source runtimes, including Apache Spark, Anaconda and Python, and gain insights from data at its source.

Explore Open Data Analytics for z/OS

IBM InfoSphere® Classic Federation Server for z/OS

Access and integrate diverse data and content sources as a single resource regardless of where the information resides.

Explore IBM InfoSphere Classic Federation Server for z/OS

IBM and Cloudera

IBM and Cloudera have partnered to offer an industry-leading, enterprise-grade Hadoop distribution, including an integrated ecosystem of products and services to support faster analytics at scale.

See how IBM and Cloudera deliver better big data solutions

See Apache Hive in action

Access Hive data faster and more securely with Db2 Big SQL. See results from 1 TB and 10 TB performance tests and highlights of security benefits.

Watch the video (11:10)

Related resources

Apache Spark

Apache Spark is a lightning-fast, open-source data-processing engine for machine learning and AI applications, backed by the largest open-source community in big data. IBM analytics services for Apache Spark give you a hassle-free Spark experience with integrated Jupyter Notebooks for faster integration and answers.

The Data Warehouse Evolved

Explore a best-in-class approach to data management and how companies are prioritizing data technologies to drive growth and efficiency.

Understanding big data beyond the hype

Read this practical introduction to the next generation of data architectures. It introduces the role of the cloud and NoSQL technologies and discusses the practicalities of security, privacy and governance. (PDF 6.5 MB)

Engage with an expert

Schedule a no-cost, one-on-one call with an IBM big data expert to learn how to extend data science and machine learning across the Apache Hadoop ecosystem.

Get connected

Apache Hadoop community

Explore Apache Hadoop

Cognitive Class