Process your data with Apache Pig

Get the info you need from big data sets with Apache Pig

From the developerWorks archives

M. Tim Jones

Date archived: April 22, 2019 | First published: February 28, 2012

Apache Pig is a high-level procedural language for querying large semi-structured data sets using Hadoop and the MapReduce platform. Pig simplifies the use of Hadoop by allowing SQL-like queries against a distributed data set. Explore the language behind Pig and discover its use in a simple Hadoop cluster.

This content is no longer being updated or maintained. The full article is provided "as is" in a PDF file. Given the rapid evolution of technology, some content, steps, or illustrations may have changed.
