InfoSphere Information Server on Hadoop

InfoSphere® Information Server provides tools that you can use to transform and cleanse big data. These tools use the resource management capabilities of Hadoop to run jobs on the Hadoop cluster. To run InfoSphere Information Server on Hadoop, configure your Hadoop environment, install InfoSphere Information Server on a Hadoop cluster, and configure your installation to work with Hadoop.

Overview
You can profile, validate, cleanse, transform, and integrate your big data on Hadoop, an open source framework that can manage large volumes of structured and unstructured data.

Preparing Hadoop
Before you install InfoSphere® Information Server, you must have an existing Hadoop cluster or you must install a new cluster.

Configuring Hadoop
You must adjust your Hadoop configuration settings to integrate your Hadoop installation with InfoSphere Information Server.

Installing InfoSphere Information Server
After you install and configure your Hadoop cluster, you can install InfoSphere Information Server.

Configuring InfoSphere Information Server to run on Hadoop
After you install InfoSphere Information Server, you set up users and configure environment variables and files.

Integrating InfoSphere Information Server on Hadoop with Apache Ambari
You can also use the Apache Ambari management tool to configure InfoSphere Information Server.

Troubleshooting InfoSphere Information Server on Hadoop
Use the information in this section to help you understand, isolate, and resolve issues with InfoSphere® Information Server on Hadoop.