What is Hadoop?

Apache™ Hadoop® is an open source software project that enables distributed processing of large structured, semi-structured, and unstructured data sets across clusters of commodity servers. It is designed to scale up from a single server to thousands of machines, with a very high degree of fault tolerance.
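
To make "distributed processing" concrete, here is a minimal single-process sketch of the MapReduce programming model that Hadoop is commonly associated with. It is an illustration only, not the Hadoop API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are assumptions made for this example. On a real cluster, each input split would typically be handled on a different machine.

```python
from collections import defaultdict
from itertools import chain


def map_phase(split):
    """Emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle_phase(mapped_pairs):
    """Group all emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(splits):
    """Run map, shuffle, and reduce over a list of text splits (hypothetical helper)."""
    mapped = chain.from_iterable(map_phase(split) for split in splits)
    return reduce_phase(shuffle_phase(mapped))


if __name__ == "__main__":
    # Two splits stand in for blocks of a file stored on different servers.
    print(word_count(["big data big insights", "data at scale"]))
```

Because each split is mapped independently, adding machines lets more splits be processed in parallel without changing the map or reduce logic.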

IBM's Open Platform (IOP) is a free, open source distribution of Hadoop. It complies with technology standards that IBM and many of our competitors have agreed to, so that our clients will be able to work seamlessly with most open-source offerings.

For clients looking for more capability, IBM offers BigInsights. BigInsights provides both IOP and valuable tools for security, analytics, and system administration.

Deploy Anywhere

Schema-less structure enables Hadoop to absorb and aggregate data (structured or not) from a number of sources
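
One way to picture a schema-less store is "schema on read": raw records are kept exactly as they arrive, and structure is applied only when the data is read. The sketch below is a plain Python illustration under that assumption. The read_record function and its "kind"/"fields" result layout are hypothetical and are not part of Hadoop or BigInsights.

```python
import csv
import io
import json


def read_record(raw):
    """Interpret one raw line at read time: JSON object, else CSV row, else free text."""
    raw = raw.strip()
    if raw.startswith("{"):
        try:
            return {"kind": "json", "fields": json.loads(raw)}
        except json.JSONDecodeError:
            pass  # Not valid JSON after all; fall through to the other interpretations.
    if "," in raw:
        row = next(csv.reader(io.StringIO(raw)))
        return {"kind": "csv", "fields": row}
    return {"kind": "text", "fields": raw}


if __name__ == "__main__":
    # Mixed structured, semi-structured, and unstructured lines stored side by side.
    for line in ['{"id": 1, "city": "Oslo"}', "2,Lima,42", "free-form note"]:
        print(read_record(line))
```

Nothing has to be converted or rejected when the data is loaded; each consumer decides how to interpret it later.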

Scalable

Add capacity without changing data formats, how data is loaded, how jobs are written, or the applications on top

Advanced tools

Access a suite of advanced data science tools for visualization, machine learning, text analysis, and others in the Apache Hadoop ecosystem

Cost effective

Access a suite of advanced tools for visualization, machine learning, text analysis, and others in the Apache Hadoop ecosystem

 
Analyze and visualize data

With IBM BigInsights Analyst, you get tools to access data with SQL queries. You can also analyze and visualize data in a familiar spreadsheet-style analytic interface, with a user-friendly management console.

Analyst tools

  • Big SQL: An SQL language processor for summarizing, querying, and analyzing data in a Hadoop distributed file system
  • BigSheets: A spreadsheet-like visualization tool to model, filter, combine, and chart data

Powerful tools for data science

With IBM BigInsights Data Scientist, you get powerful tools to answer "what if" questions, discover patterns in data, and generate deeper insights faster.

Data scientist tools

  • SystemML: Machine learning algorithms optimized for Hadoop
  • Big R: A tool that enables R users to execute R models across a Hadoop cluster
  • Text Analytics: A user interface and engine for extracting structured information from unstructured and semi-structured text
  • Big SQL: An SQL language processor for summarizing, querying, and analyzing data in a Hadoop distributed file system
  • BigSheets: A spreadsheet-like visualization tool to model, filter, combine, and chart data

Available in the Cloud

IBM BigInsights on Cloud offers the performance and security of an on-premises deployment without the cost or complexity of managing your own infrastructure.

Hadoop on IBM BigInsights on Cloud features

  • Managed operations provide 24 x 7 monitoring
  • IBM Open Platform with current Apache Hadoop components
  • Optional IBM BigInsights Analyst and BigInsights Data Scientist capabilities
  • Dedicated bare metal nodes for enhanced performance, data privacy, and security

Need help getting started with Hadoop on IBM BigInsights?

IBM provides the industry's premier open Hadoop solution that delivers critical insights with flexible deployment options. To help clients decide how to get started with this technology, we offer a service called Stampede. Stampede is designed to help clients determine their information requirements and evaluate the right technology approach to meet those needs. The result is greater clarity on how to move forward and a faster time to value.

Hadoop Resources

Access analyst reports, data sheets, white papers and more.

Hadoop in the cloud

Leverage big data analytics easily and cost-effectively with IBM BigInsights for Apache Hadoop

IBM BigInsights for Apache Hadoop

Efficiently manage and mine big data for valuable insights.

Using IBM BigInsights to accelerate big data time-to-value

Discover how Hadoop innovation and unique capabilities bring lower costs and faster time to value.

Client success stories

 

Ernst & Young

Ernst & Young uses big data and analytics to combat fraud and mitigate risk for its customers.

Optibus

Optibus enables smarter public transport systems through real-time analytics of in-motion data.

Teikoku Databank, Ltd

Teikoku Databank shortens the time to process billions of textual data items from several days to 30 minutes.
