Requirements for installing Db2 Big SQL
There are requirements that you must meet before you can install Db2® Big SQL on a Hortonworks Data Platform (HDP) or Cloudera Distribution Including Apache Hadoop (CDH) cluster.
File system requirements
- Ensure that /home is not mounted with the nosuid parameter.
You can run the mount command with no options at the Linux® command line to display information about all known mount points.
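For example, to confirm the mount options for /home, you can filter the output (a quick check; if /home is not a separate mount point, inspect the options of its parent file system instead):
mount | grep /home
If nosuid appears in the options list, remount the file system without that parameter.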
- Ensure that the following directories have the required disk space available.
  - Db2 Big SQL Home directory: The disk space that is required for the instance home directory is calculated at run time and varies. Approximately 1 to 1.5 GB of free space in the /home directory is normally required.
  - /var: 512 MB of free space
  - /tmp: 5 GB of free space
  - /usr: Minimum of 5 GB of free space, or create a mount point /usr/ibmpacks with at least 5 GB of free space
You must also include disk space for required databases, software, and communication products. Ensure that the file system is not mounted with the concurrent I/O (CIO) option.
If the temporary directory (/tmp by default) is a mounted drive, all users must be able to directly execute binaries from it. For example, if /tmp is a mounted drive, do not specify the noexec parameter with the mount command when you create the drive.
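To verify the available free space on these file systems before you install, you can use the df command (output varies by system; directories that share a file system report the same values):
df -h /home /var /tmp /usr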
- (CDH only) For SSL clusters, grant permission 744 on the directory and the files that contain the certificates and keys to the SSH user that performs the installation.
Without the 744 file permission, you might see permission denied errors on SSL files during the installation process. For more information, see Access errors when installing on a secure CDH cluster.
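For example, assuming the certificates and keys are stored under a directory such as /opt/cloudera/security (a hypothetical location; substitute your actual path), you could set the permissions as follows:
chmod 744 /opt/cloudera/security /opt/cloudera/security/*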
- Disable prelinking for shared libraries. On RHEL6, prelinking is enabled by default; on RHEL7, it is disabled by default. If the value of PRELINKING in the prelink configuration file is yes, disable prelinking by following these steps:
  - As the root user, edit the prelink configuration file with the following command:
    sudo vim /etc/sysconfig/prelink
  - In the prelink file, change the PRELINKING value from yes to no.
  - As the root user, run the following command:
    sudo prelink -ua
For more information, see the technote BigSQL fails to install or start with SQL0901 on RHEL6 machines with more than 256GB RAM.
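To confirm that prelinking is disabled after you complete these steps, you can check the configuration value:
grep PRELINKING /etc/sysconfig/prelink
The command should report PRELINKING=no.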
Software requirements
- Download, install, and deploy HDP or CDH.
- Db2 Big SQL supports HDP 2.6.0 to 2.6.5 on Ambari 2.5.1 and higher, and CDH 5.12 to 5.16, 6.2, and 6.3.
For more information on hardware and software requirements, see System requirements for IBM Big SQL and IBM BigInsights.
- Make sure that the same version of HDP or CDH is installed on all nodes. Multiple versions can cause problems.
- You must be an administrator of the HDP or CDH cluster and have root access to the cluster, or be a non-root user with passwordless sudo privileges, as specified in Configuring non-root access to Db2 Big SQL.
- The Ambari (HDP) or Cloudera Manager (CDH) server must be running.
- (CDH only) JDK 8 must be installed on all Cloudera hosts in either the /usr/java or /usr/lib/jvm directory.
Recommended: Remove all previous JDK versions from these directories on all Cloudera hosts, or in Cloudera Manager, set JAVA_HOME to JDK 8.
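For example, to see which JDK installations are present on a host and which version is active (a quick check; directory contents vary by installation method):
ls -d /usr/java/* /usr/lib/jvm/* 2>/dev/null
java -version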
- Install the package ksh on all nodes:
yum install ksh
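To confirm that the package is present on a node, you can query the RPM database:
rpm -q ksh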
Database requirements
If you plan to connect to another database server through ODBC, be sure to create Db2 Big SQL user IDs and passwords that are valid on the other database server. For example, the Db2 ODBC driver limits the length of passwords to 17 characters, and longer passwords are truncated.
Cluster requirements
- Make sure that all host names are listed as lowercase in either the /etc/hosts file or the DNS configuration.
Both the long and short forms of the host names on which Db2 Big SQL resides must resolve to the underlying IP address from all of those hosts. You can ensure this by including both the long and short forms of each host name in the /etc/hosts file, or by configuring DNS accordingly.
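For example, to verify the resolution of both forms of a host name (bigsql1 and bigsql1.example.com are placeholders for your actual short and long host names), run the following on every host and confirm that both commands return the same IP address:
getent hosts bigsql1
getent hosts bigsql1.example.com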
- Servers that host Db2 Big SQL nodes can have different hardware and operating systems. However, there are some important things to bear in mind:
- If you plan to have Db2 Big SQL primary and secondary head nodes, the servers that host these nodes must have the same operating system and operating system version.
- The servers that host Db2 Big SQL must have the same processor architecture; a mixture of Power and Intel servers cannot be used in the same Db2 Big SQL cluster.
- All hosts on which Db2 Big SQL resides must have the same number of local disks, so that the disks that are referenced in the Db2 Big SQL Data directories configuration property are consistent on all Db2 Big SQL hosts.
- While different hardware is supported, for optimal performance and resource allocation, use the same hardware for all servers that host Db2 Big SQL nodes. For example, adding a less powerful server to an existing cluster of more powerful hosts is not recommended, because the new host can become a resource bottleneck and adversely impact Db2 Big SQL performance.
- If a firewall is running, make sure that it is set up to allow Db2 Big SQL access to its ports.
- For the best performance, install the Db2 Big SQL service on at least two nodes in the cluster, with at least one node designated as the Db2 Big SQL master.
- On all nodes, in the /etc/sudoers file, if the line Defaults requiretty exists, comment it out by using a # prefix:
  #Defaults requiretty
- Precreate the Db2 Big SQL service ID, or let the installer create it (as bigsql). If you precreate the Db2 Big SQL service ID (locally, as bigsql, or a non-default user ID), ensure that the bigsql UID or the non-default UID is the same across all of the nodes. You can determine the UID on each node with the following command:
  id <Db2 Big SQL username>
  - If you precreate the bigsql user ID, make sure that the home directory path for Db2 Big SQL (such as /home/bigsql) is the same on all nodes.
- If you precreate a user ID other than bigsql, make sure that the home directory path for Db2 Big SQL (such as /home/notbigsql) is the same on all nodes.
- If you precreate the bigsql user ID, note the user's uid and enter it in Ambari during the installation process.
If you precreate the Db2 Big SQL service ID, the ID must meet specific requirements. For more information, see Users, groups, SSH keys, and ports.
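For example, to compare the UID across all nodes in one pass, you might run a loop such as the following from a host with SSH access to all nodes (node1, node2, and node3 are placeholder host names):
for host in node1 node2 node3; do ssh $host id -u bigsql; done
All nodes should report the same numeric UID.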
- Set up passwordless SSH for the Db2 Big SQL service user, or let the installer do it.
At run time, Db2 Big SQL requires passwordless SSH for the Db2 Big SQL service user ID, which, by default, is set up by the installer. However, if you precreated the Db2 Big SQL service user, you can preconfigure passwordless SSH for the service ID (for example, if you want to use a custom key location). If you choose this option, make sure that the Db2 Big SQL service ID has passwordless SSH access from all Db2 Big SQL nodes (head and workers) to all Db2 Big SQL nodes in the cluster.
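If you preconfigure passwordless SSH yourself, a typical approach is to generate a key pair for the service user and distribute the public key (a minimal sketch, run as the bigsql user; node1 through node3 are placeholder host names):
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for host in node1 node2 node3; do ssh-copy-id -i ~/.ssh/id_rsa.pub bigsql@$host; done
Afterward, ssh node1 hostname should connect without prompting for a password.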
- (HDP) Ensure that the following steps are completed for the user that will run the Ambari server and agents on a cluster where Ambari and Db2 Big SQL are to be deployed.
  (CDH) Ensure that the following steps are completed by the user that will run the Db2 Big SQL installation.
  Note: This user can be either root, or a non-root user that was set up in the /etc/sudoers file, as described in Configuring non-root access to Db2 Big SQL.
  - Set up passwordless SSH from the Db2 Big SQL head node to itself and to all data nodes.
- Set up passwordless SSH from the head node to the nodes that are hosting the Ambari (HDP) or Cloudera Manager (CDH) server, HDFS NameNode, Hive Server2, and Hive Metastore processes.
- Install the HDFS, Hive, YARN, and Sqoop services, and verify that they are running properly. Optionally, install the HBase service.
- Increase the default value of the YARN Resource Manager heap size. To avoid out-of-memory (OOM) or Java™ heap space problems on busy clusters, set the heap size to 4096 MB.
- Confirm that you have Hive metastore connectivity from the node where Db2 Big SQL will be installed, even if Db2 Big SQL will be on the same node as Hive. You can test this connectivity by opening the Hive shell from the command line and running a simple command. Do the following steps:
  - Authenticate to hive:
    HDP:
    su - hive
    CDH:
    su -s /bin/bash - hive
  - Open the Hive shell by typing the following from the command line:
    hive
  - Run a command that displays tables, such as:
    hive> show tables;
- Make sure that the directories that are specified for the following services contain only the paths that you want to use. The directories should not be in the root (/) partition or on any other disk where you do not want them to reside.
- HDFS
- YARN
- ZooKeeper
- Kafka
Tip: In general, review all directories that are automatically filled in by Ambari (HDP) or Cloudera Manager (CDH) when a component is installed.
- Install the following components on the Db2 Big SQL head node. You can add them by using Ambari (HDP) or Cloudera Manager (CDH).
- For the core operation of Db2 Big SQL: HDFS, Hive, HCat, YARN, and Sqoop clients
- (HDP only) To take advantage of the YARN integration capabilities (YARN is enabled): Slider client
- To take advantage of the HBase integration capabilities if HBase is installed: HBase client
- (HDP only) To take advantage of the Atlas integration capabilities if Atlas is installed: Atlas client
- To take advantage of the Spark integration capabilities if Spark is installed: Spark client
- For regular Db2 Big SQL worker nodes, install the following components. You can add them by using Ambari (HDP) or Cloudera Manager (CDH).
- For the core operation of Db2 Big SQL: DataNode, HCat client, Hive client (CDH)
- (HDP only) To take advantage of the YARN integration capabilities (YARN is enabled): NodeManager
- To take advantage of the HBase integration capabilities if HBase is installed: RegionServer
- (HDP only) To take advantage of the Atlas integration capabilities if Atlas is installed: Atlas client
- To take advantage of the Spark integration capabilities if Spark is installed: Spark client
- For edge Db2 Big SQL worker nodes, install the following components. You can add them by using Ambari (HDP) or Cloudera Manager (CDH).
- For the core operation of Db2 Big SQL: HDFS client, HCat client, Hive client (CDH)
- (HDP only) To take advantage of the YARN integration capabilities (YARN is enabled): NodeManager
- To take advantage of the HBase integration capabilities if HBase is installed: HBase client
- (HDP only) To take advantage of the Atlas integration capabilities if Atlas is installed: Atlas client
- To take advantage of the Spark integration capabilities if Spark is installed: Spark client
- Make sure that these services are running:
- Hive
- HDFS
- YARN
- MapReduce2 (MRv2)
- HBase (if installed)
- Sqoop
- (HDP only) Knox, and the LDAP server if you use LDAP. If you are not using an LDAP server, start the Knox Demo LDAP service, as described in a later step.
- (CDH only) Sentry (if installed)
- (CDH only) If your cluster is kerberized:
- A valid ticket for the hdfs user is required on the target install node for the Db2 Big SQL Head. Make sure that the kinit command is run in the HDFS superuser account. The Db2 Big SQL installer uses the HDFS superuser to create a directory on HDFS.
- If HBase is installed, and you plan to configure Db2 Big SQL for HBase during the installation, a valid ticket for the hbase user is required on the target install node for the Db2 Big SQL Head. Make sure that the kinit command is run in the HBase superuser account.
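For example, to obtain and verify a ticket for the hdfs user from a keytab (the keytab path and realm shown are hypothetical; substitute the values for your cluster):
kinit -kt /path/to/hdfs.keytab hdfs@EXAMPLE.COM
klist
Repeat the same pattern for the hbase user if you plan to configure Db2 Big SQL for HBase.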
- (HDP only) Make sure that the Knox service is started, and that the LDAP server is started.
Knox requires that LDAP is running, even if your cluster is not configured for LDAP. The Knox service provides a Demo LDAP server by default.
- Click the Knox service.
- In the Summary tab, click Service Actions, and select the Start Demo LDAP option from the drop-down menu.
- (HDP only) To ensure that all Db2 Big SQL operations succeed when Kerberos is enabled, verify that the users who are referenced in the Demo LDAP configuration also exist as operating system users on all nodes of the cluster.
- If you create the Demo LDAP users on the operating system, such as the guest user, make sure that the UID assigned to the user is the same on all nodes.
- Ensure that the user UID is greater than the value set in the YARN configuration for Minimum user ID for submitting job. By default, the value is set to 1000.
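For example, to create the guest user with a consistent UID that is above the YARN minimum (1005 is an arbitrary illustrative value; choose one unused UID and use it on every node):
useradd -u 1005 guest
id -u guest
Run the same commands on each node and confirm that id -u returns the same value everywhere.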