By: Adalberto Medeiros.
For those who want to use Hadoop on a PowerLinux cluster to process large data sets, this tutorial shows how to build the latest community Hadoop release (1.0.3) and how to install it.
For more information on Hadoop:
From the Apache web page, Hadoop is defined as "a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures."