IBM BigInsights

Installing the BigInsights value-add packages

After you have prepared the environment and acquired the software, follow these steps to install the value-add services.

Before you begin

  • Ensure that you installed an Apache Hadoop platform, such as IBM® Open Platform with Apache Hadoop.
  • Ensure that you followed the steps in Preparing to install the BigInsights value-add services.
  • Ensure that you acquired the software from the Passport Advantage download website.
  • If you install any of the BigInsights value-add services as a non-root user, prefix with sudo each command that would normally require the root user.

About this task

The acquired software has a *.bin extension. The name of the *.bin file depends on the module that you download:

IBM BigInsights Analyst
  Operating system                             File name
  Red Hat Enterprise Linux on x86-64 (RHEL)    bana-X.X.X.X.el.x86_64.bin
  Red Hat Enterprise Linux on Power Systems    bana-X.X.X.X.el7.ppc64le.bin
  SUSE Linux Enterprise Server                 bana-X.X.X.X.sles11.x86_64.bin

IBM BigInsights Data Scientist
  Operating system                             File name
  Red Hat Enterprise Linux on x86-64 (RHEL)    bds-X.X.X.X.el.x86_64.bin
  Red Hat Enterprise Linux on Power Systems    bds-X.X.X.X.el7.ppc64le.bin
  SUSE Linux Enterprise Server                 bds-X.X.X.X.sles11.x86_64.bin

IBM BigInsights Apache Hadoop
  Operating system                             File name
  Red Hat Enterprise Linux on x86-64 (RHEL)    bah-X.X.X.X.el.x86_64.bin
  Red Hat Enterprise Linux on Power Systems    bah-X.X.X.X.el7.ppc64le.bin
  SUSE Linux Enterprise Server                 bah-X.X.X.X.sles11.x86_64.bin

IBM BigInsights for Apache Hadoop (Non-Production environment)
  Operating system                             File name
  Red Hat Enterprise Linux on x86-64 (RHEL)    bahnpe-X.X.X.X.el.x86_64.bin
  Red Hat Enterprise Linux on Power Systems    bahnpe-X.X.X.X.el7.ppc64le.bin
  SUSE Linux Enterprise Server                 bahnpe-X.X.X.X.sles11.x86_64.bin
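
The file name prefix identifies the module. The following is a minimal shell sketch of that mapping, assuming only the prefixes listed in the tables above (bana, bds, bah, bahnpe). The function name module_for_bin is hypothetical and is not part of the product.

```shell
#!/bin/sh
# Hypothetical helper: map a downloaded *.bin file name to its module,
# using only the file name prefixes listed in the tables above.
module_for_bin() {
    case "$(basename "$1")" in
        bana-*.bin)   echo "IBM BigInsights Analyst" ;;
        bds-*.bin)    echo "IBM BigInsights Data Scientist" ;;
        bahnpe-*.bin) echo "IBM BigInsights for Apache Hadoop (Non-Production environment)" ;;
        bah-*.bin)    echo "IBM BigInsights Apache Hadoop" ;;
        *)            echo "unknown module: $1" >&2; return 1 ;;
    esac
}
```

For example, module_for_bin /tmp/bds-X.X.X.X.el7.ppc64le.bin reports the Data Scientist module, and any other prefix returns a non-zero status.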

When you run the *.bin file, configuration files are copied to the appropriate locations so that Ambari lists the value-add services as available. When you add the value-add services through Ambari, additional software packages might be downloaded. If the Hadoop cluster cannot directly access the internet, you can create a local mirror repository.

Where you perform the following steps depends on whether the Hadoop cluster has direct internet access.
  • If the Hadoop cluster has direct access to the internet, perform the steps from the Ambari server of the Hadoop cluster.
  • If the Hadoop cluster does not have direct internet access, perform the steps from a Linux host with direct internet access. Then, transfer the files, as required, to a local repository mirror.
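
The choice between the two cases can be sketched as a small shell function. This is an illustration under assumptions, not part of the installer: the function name where_to_run is hypothetical, and the caller supplies the probe command whose exit status stands in for "this host can reach the internet".

```shell
#!/bin/sh
# Hypothetical helper: report where to run the installation steps.
# The arguments form a probe command; exit status 0 is treated as
# "this host has direct internet access".
where_to_run() {
    if [ "$#" -eq 0 ]; then
        echo "usage: where_to_run <probe command...>" >&2
        return 2
    fi
    if "$@" >/dev/null 2>&1; then
        echo "Run the steps on the Ambari server of the Hadoop cluster."
    else
        echo "Run the steps on a Linux host with internet access, then transfer the files to a local repository mirror."
    fi
}

# Example probe (assumptions: curl is installed, and the public repository
# host named later in repoinfo.xml is a suitable reachability check):
# where_to_run curl --silent --head --max-time 10 http://ibm-open-platform.ibm.com/
```

Passing the probe as arguments keeps the decision logic separate from the network check, so a different check can be substituted if curl is not available.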

Procedure

  1. Update the permissions on the downloaded *.bin file to enable execute.
    chmod +x <package_name>.bin
  2. Run the *.bin file to extract and install the services in the module.
    ./<package_name>.bin
    where <package_name> is the module name and version number.
  3. At the prompt, agree to the license terms. Reply yes or y to continue with the installation.
  4. After the prompt, choose if you want to install it online (Option 1) or offline (Option 2).
    Option Description
    Option 1: Hadoop cluster has access to the internet.

    The program will lay out the Ambari service configuration files, and update the repository locations in the Ambari server file, repoinfo.xml.

    Skip to step 6.

    Option 2: Hadoop cluster does not have internet access.

    This option initiates a download of files to set up a local repository mirror. A subdirectory called BigInsights will be created and RPMs and the associated files will be located in directory BigInsights/packages/....

    Tip: Even if the Hadoop cluster has internet access, a local repository is recommended to avoid multiple downloads of the same software when you install services across multiple nodes. If you plan to set up a local repository, choose Option 2.
    For an example of running the Analyst module for OFFLINE installation, see Example.
  5. Set up a local repository.

    A local repository is required if the Hadoop cluster cannot connect directly to the internet, or if you wish to avoid multiple downloads of the same software when installing services across multiple nodes. In the following steps, the host that performs the repository mirror function is called the repository server. If you do not have an additional Linux host, you can use one of the Hadoop management nodes. The repository server must be accessible over the network by the Hadoop cluster. The repository server requires an HTTP web server. The following instructions describe how to set up a repository server by using a Linux host with an Apache HTTP server.

    1. On the repository server, if the Apache HTTP server is not installed, install it:
      yum install httpd
      or
      zypper install httpd
    2. On the repository server, ensure that the createrepo package is installed.
      yum install createrepo
      or
      zypper install createrepo
    3. Make sure there is network access from all nodes in your cluster to the repository server. If data nodes are on a private network and the repository server is external to the Hadoop cluster, you might need to configure iptables for IP forwarding and masquerading.
    4. On the repository server, create a directory for your value-add repository, such as <mirror web server document root>/repos/valueadds. For example, for Apache httpd, the default document root is /var/www/html. The -p option also creates the repos parent directory if it does not exist.
      mkdir -p /var/www/html/repos/valueadds
    5. Because you selected Option 2 in step 4, the RPMs were downloaded to a subdirectory called BigInsights/packages/BigInsights-Valuepacks/<OS><version>/<platform>/<IOP_version>/.... Copy all of the RPMs to the <your.mirror.web.server.document root>/repos/valueadds directory on the mirror web server.
      cp BigInsights/packages/BigInsights-Valuepacks/RHEL<version>/<platform>/<IOP_version>/* /var/www/html/repos/valueadds/
    6. Start this web server. If you use Apache httpd, start it by using either of the following commands:
      apachectl start 
      or
      service httpd start

      Ensure that any firewall settings allow inbound HTTP access from your cluster nodes to the mirror web server.

    7. Test your local repository by browsing to the web directory:
      http://<your.mirror.web.server>/repos/valueadds

      You should see all of the files that you copied to the repository server.

    8. On the repository server, run the createrepo command to initialize the repository:
      createrepo /var/www/html/repos/valueadds
    9. In the /var/www/html/repos/valueadds directory, find the RPM to install on the Ambari Server host of the Hadoop cluster:

      BigInsights Analyst
        Operating system                             File name
        Red Hat Enterprise Linux on x86-64 (RHEL)    BI-Analyst-IOP-X.X.X.X.x86_64.rpm
        Red Hat Enterprise Linux on Power Systems    BI-Analyst-IOP-X.X.X.X.el7.ppc64le.rpm
        SUSE Linux Enterprise Server                 BI-Analyst-IOP-X.X.X.X.SLES11.x86_64.rpm

      BigInsights Data Scientist
        Operating system                             File name
        Red Hat Enterprise Linux on x86-64 (RHEL)    BI-DS-IOP-X.X.X.X.x86_64.rpm
        Red Hat Enterprise Linux on Power Systems    BI-DS-IOP-X.X.X.X.el7.ppc64le.rpm
        SUSE Linux Enterprise Server                 BI-DS-IOP-X.X.X.X.SLES11.x86_64.rpm

      Tip: The BigInsights Data Scientist module also entitles you to the features of the BigInsights Analyst module. Install both RPM packages so that the BigInsights - Home service is available for installation.

      BigInsights Apache Hadoop
        Operating system                             File names
        Red Hat Enterprise Linux on x86-64 (RHEL)    BI-Apache-Hadoop-IOP-X.X.X.X.x86_64.rpm
                                                     BI-Analyst-IOP-X.X.X.X.x86_64.rpm
        Red Hat Enterprise Linux on Power Systems    BI-Apache-Hadoop-IOP-X.X.X.X.el7.ppc64le.rpm
                                                     BI-Analyst-IOP-X.X.X.X.el7.ppc64le.rpm
        SUSE Linux Enterprise Server                 BI-Apache-Hadoop-IOP-X.X.X.X.SLES11.x86_64.rpm
                                                     BI-Analyst-IOP-X.X.X.X.SLES11.x86_64.rpm

      BigInsights Apache Hadoop for Non-Production Environment
        Operating system                             File name
        Red Hat Enterprise Linux on x86-64 (RHEL)    BI-Apache-Hadoop-NPE-IOP-X.X.X.X.x86_64.rpm
        Red Hat Enterprise Linux on Power Systems    BI-Apache-Hadoop-NPE-IOP-X.X.X.X.el7.ppc64le.rpm
        SUSE Linux Enterprise Server                 BI-Apache-Hadoop-NPE-IOP-X.X.X.X.SLES11.x86_64.rpm

      Then, copy the RPM files to the Ambari Server host and install them by using one of the following commands:
      yum install <BI-xxx-xxx...>.rpm
      or
      zypper install <BI-xxx-xxx...>.rpm
    10. On the Ambari Server node, open the /var/lib/ambari-server/resources/stacks/BigInsights/<version_number>/repos/repoinfo.xml file. If the file does not exist, create it. Ensure that the <baseurl> element for the BIGINSIGHTS-VALUEPACK <repo> entry points to your repository server. The file might contain multiple <repo> sections. Make sure that the URL that you tested in substep 7 of this step exactly matches the value in the <baseurl> element.
      For Spectrum Scale (GPFS):

      Navigate to the repoinfo.xml file at /var/lib/ambari-server/resources/stacks/BigInsights/<version_number>.SpectrumScale/repos/repoinfo.xml

      Then manually add the repo entry:
      <repo>
       <baseurl> http://<your.mirror.web.server>/repos/valueadds </baseurl>
       <repoid>BIGINSIGHTS-VALUEPACK.version_number</repoid>
       <reponame>BIGINSIGHTS-VALUEPACK.version_number</reponame>
      </repo>
      HDFS example:
      The repoinfo.xml file might look like the following content after you change http://ibm-open-platform.ibm.com/repos/BigInsights-Valuepacks/ to http://your.mirror.web.server/repos/valueadds:
      <repo> 
      <baseurl> http://<your.mirror.web.server>/repos/valueadds 
      </baseurl>
      <repoid>BIGINSIGHTS-VALUEPACK.version_number</repoid> 
      <reponame>BIGINSIGHTS-VALUEPACK.version_number</reponame>
      </repo>
      Note: The new <repo> section might appear as a single line.
      Tip: If you later find an error in this configuration file, make corrections and run the following command:
      yum clean all
      Tip: If you use a local repository and its URL changes at any time, you must also update the <baseurl> value. You can edit the repoinfo.xml file, or update the fields in the Ambari web tool. To use the Ambari web tool, complete the following steps:
      1. From the Ambari web dashboard, in the menu bar, click admin > Manage Ambari.
        (Figure: Managing the repositories from the Ambari dashboard.)
      2. From the Clusters panel, click Versions > <stack name>.
      3. Change the URL as needed, and click Save.
  6. When the module is installed, restart the Ambari server.
    sudo ambari-server restart
  7. Open the Ambari web interface and log in. The default address is the following URL:
    http://<server-name>:8080
    The default login name is admin and the default password is admin.
  8. Click Actions > Add Service. The list of services shows the services that you previously added and the BigInsights services that you can now add.
    Restriction: If your platform is SUSE Linux Enterprise Server (SLES), you cannot install the R service from the Ambari dashboard by using the Add Service wizard. Ignore the R install box in the Add Service wizard.
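
For the repoinfo.xml change in step 5, the <repo> entry can be generated rather than typed by hand. The following shell sketch prints an entry in the same shape as the examples in that step. The function name print_valuepack_repo is hypothetical, and the host name and version number are placeholders that you supply; the sketch does not edit repoinfo.xml itself.

```shell
#!/bin/sh
# Hypothetical helper: print a <repo> entry for repoinfo.xml.
# $1 = host name of the mirror web server
# $2 = version number that is used in the repository ID and name
print_valuepack_repo() {
    if [ "$#" -ne 2 ]; then
        echo "usage: print_valuepack_repo <mirror_host> <version_number>" >&2
        return 2
    fi
    mirror_host=$1
    version=$2
    cat <<EOF
<repo>
 <baseurl>http://${mirror_host}/repos/valueadds</baseurl>
 <repoid>BIGINSIGHTS-VALUEPACK.${version}</repoid>
 <reponame>BIGINSIGHTS-VALUEPACK.${version}</reponame>
</repo>
EOF
}

# Example with placeholder values (both values are assumptions):
# print_valuepack_repo mirror.example.com X.X.X.X
```

Compare the printed <baseurl> value with the URL that you tested in the browser before you paste the entry into the file. The two values must match exactly.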

Example

This is an example of running the IBM BigInsights Analyst module for OFFLINE installation:
[root@mgt1 ~]# chmod +x BI-Analyst-1.0.0.1-IOP-4.0.x86_64.bin
[root@mgt1 ~]# ./BI-Analyst-1.0.0.1-IOP-4.0.x86_64.bin
Creating directory ./BigInsights
Verifying archive integrity... All good.
Uncompressing IBM BigInsights Analyst Package Installer   100%
************************************************************************ 
License Files
************************************************************************
License files are available in the /root/BigInsights/licenses
Do you Accept the Terms and Conditions in the Licenses Directory ? (y/n) :
y
Will this be an ONLINE(1) or OFFLINE(2) Installation ? 
2
************************************************************************ 
Installing Package...
************************************************************************
Downloading Package...
Installing Package...
BigInsights Analyst RPMs have been extracted to the BigInsights/packages
        Directory
Installation Complete
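
After an OFFLINE run such as this one, the extracted RPMs must be copied to the repository server directory, as described in step 5. The following shell sketch shows one way to do that. It assumes that only the *.rpm files are needed, whereas the cp command in the procedure copies every file in the directory. The function name stage_rpms is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy every RPM under the extracted packages tree
# into the mirror directory, then print how many RPMs the mirror holds.
# $1 = extracted packages directory (for example, BigInsights/packages)
# $2 = mirror directory (for example, /var/www/html/repos/valueadds)
stage_rpms() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # Copy each RPM individually, whatever its depth below $src.
    find "$src" -type f -name '*.rpm' -exec cp {} "$dest"/ \; || return 1
    # Count the RPMs that are now directly inside the mirror directory.
    find "$dest" -maxdepth 1 -type f -name '*.rpm' | wc -l | tr -d ' '
}

# Example (paths taken from the procedure; run createrepo afterward):
# stage_rpms BigInsights/packages /var/www/html/repos/valueadds
# createrepo /var/www/html/repos/valueadds
```

A count of 0 suggests that the extraction did not produce any RPMs or that the source path is wrong, so check the path before you run createrepo.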

What to do next

Select the service that you want to install and deploy. Even though your module might contain multiple services, install only the specific service that you want and the BigInsights® Home service. Installing one value-add service at a time is recommended. For more information, follow the service-specific installation instructions.

To see a suggested layout of services, see Suggested services layout for IBM Open Platform with Apache Hadoop.


