Installing Db2 Big SQL on CDH

This topic describes how to add the Db2® Big SQL service to a Cloudera Distribution Including Apache Hadoop (CDH) cluster.

Before you begin

Make sure that you follow the prerequisite steps that are listed in Requirements for installing Db2 Big SQL.

Recommended: Manually run the Db2 Big SQL pre-installation check utility on the candidate Db2 Big SQL Head node, passing in all other candidate Db2 Big SQL nodes. Review and address any issues before you install Db2 Big SQL.

To run this utility, you must be logged on as root, a non-root user with sudo privileges, or the Db2® Big SQL service user. In this release, logging on as a non-default bigsql user is not supported.

If HBase is installed on your cluster and you want Db2 Big SQL to access HBase, you must install each worker on either an HBase Region Server node or an HBase client node.

When you install Db2 Big SQL, there are guidelines that you can follow to optimize performance. For more information, see Cluster and system tuning.

Procedure

  1. On the candidate Db2 Big SQL Head node, create the host list file /tmp/bigsqlHostList that contains the fully qualified domain name (FQDN) of each node in your cluster.

    Put each host name on its own line, starting with the target installation node for the Db2 Big SQL Head. The remaining host names become Db2 Big SQL worker nodes.

  2. Edit the installation configuration parameters to match your cluster environment by running the Db2 Big SQL configuration utility.
    1. Type the following commands to view the configuration parameters and their default values:
      cd /usr/ibmpacks/IBM-Big_SQL/5.0.4.0/bigsql-cli/
      ./bigsql-config -display
    2. Modify installation parameters that are not accurate for your cluster.
      ./bigsql-config -set "key=value"

      At a minimum, modify the following configuration parameters:

      • CM_HOST
      • HDFS_USER

      For CM_HOST, specify the fully qualified domain name (FQDN) of the host. For HDFS_USER, make sure that the specified value is a valid HDFS superuser. If the cluster is kerberized, ensure that kinit was run in the HDFS superuser account. The Db2 Big SQL installer uses the HDFS superuser to create a directory on HDFS.

      For example,
      ./bigsql-config -set "CM_HOST=<your.cloudera.manager.server.com>"
  3. Using the same utility, update configuration for the Db2 Big SQL service.

    You can enable or disable the following options:

    • HBase (if HBase is installed)
    • Impersonation
    • Pluggable Authentication Module (PAM)

    For more information about these options, run the following command:

    ./bigsql-config -help
    Note: You can also do this step after Db2 Big SQL is installed.
  4. Install Db2 Big SQL by running the bigsql-install utility.
    cd /usr/ibmpacks/IBM-Big_SQL/5.0.4.0/bigsql-cli/
    ./bigsql-install
  5. When prompted, enter passwords for the bigsql and Cloudera Manager admin users.
    Note: If the cluster is kerberized, Db2 Big SQL is automatically kerberized during the installation process. You can manage the Db2 Big SQL service keytabs in Cloudera Manager.
  6. If Sentry is enabled on the cluster, run the following commands from the Hive server:
    create role bigsql_role;
    grant role bigsql_role to group <bigsql_group>;
    grant all on server server1 to role bigsql_role with grant option;

    <bigsql_group> is the group of the bigsql user. To obtain <bigsql_group>, run the following command:

    ./bigsql-config -get SYSADM_GRP
  7. Validate your installation with the Db2 Big SQL cluster administration utility.
    cd /usr/ibmpacks/IBM-Big_SQL/5.0.4.0/bigsql-cli/
    ./bigsql-admin -smoke
    Tip: To run data load tests, run the command /usr/ibmpacks/IBM-Big_SQL/5.0.4.0/bigsql-cli/BIGSQL/package/scripts/bigsql-smoke.sh -l.
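
The host list file from step 1 can be created by hand, but a short script makes the required ordering explicit. The following is a minimal sketch, not part of the product: the host names (head1.example.com, worker1.example.com, worker2.example.com) are hypothetical placeholders, and the blank-line and duplicate checks are extra sanity checks that the installer does not necessarily require.

```shell
#!/bin/sh
# Sketch: build the host list file described in step 1.
# One FQDN per line, Db2 Big SQL Head node first, workers after it.
# All host names below are hypothetical placeholders.
HOST_LIST="/tmp/bigsqlHostList"
HEAD_NODE="head1.example.com"
WORKER_NODES="worker1.example.com worker2.example.com"

# Write the Head node first, then each worker on its own line.
{
  printf '%s\n' "$HEAD_NODE"
  for host in $WORKER_NODES; do
    printf '%s\n' "$host"
  done
} > "$HOST_LIST"

# Extra sanity checks (assumed to be useful, not required by the installer):
# warn about blank lines and duplicate host names.
if grep -q '^[[:space:]]*$' "$HOST_LIST"; then
  echo "Warning: blank line found in $HOST_LIST" >&2
fi
dupes=$(sort "$HOST_LIST" | uniq -d)
if [ -n "$dupes" ]; then
  echo "Warning: duplicate host names: $dupes" >&2
fi

echo "Head node entry: $(head -n 1 "$HOST_LIST")"
```

Replace the placeholder names with the FQDNs of your own nodes before you run the script on the candidate Head node.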

Results

If there is an error during the installation process, you can fix the error and run the installation utility again. Review installation log files in the /tmp/<bigsql_user>/logs directory (for example, /tmp/bigsql/logs). You can run the utility as many times as needed. When you rerun the utility, the utility resumes from the last successful step.
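
When an installation attempt fails, a quick scan of the log directory can help locate the failing step before you rerun the utility. The following is a rough sketch only: the scan_logs helper is hypothetical, and searching for the word "error" is an assumption about what the log files contain, so adjust the pattern to what you actually see in your logs.

```shell
#!/bin/sh
# Sketch: scan the installation logs for lines that mention an error.
# /tmp/bigsql/logs is the example location given above; set LOG_DIR
# if your Db2 Big SQL service user differs.
LOG_DIR="${LOG_DIR:-/tmp/bigsql/logs}"

# scan_logs is a hypothetical helper, not part of Db2 Big SQL.
scan_logs() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "No log directory at $dir"
    return 0
  fi
  # -r: search recursively, -i: ignore case, -n: print line numbers.
  # The "error" pattern is an assumption about the log contents.
  grep -rin "error" "$dir" || echo "No lines matching 'error' under $dir"
}

scan_logs "$LOG_DIR"
```

After you fix the reported problem, run ./bigsql-install again; as noted above, the utility resumes from the last successful step.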

What to do next

Optional: Installing the Db2 Big SQL console on CDH