Db2 Big SQL configuration utility

A utility is available to configure the Db2® Big SQL service on Cloudera Data Platform (CDP).

Usage notes

The Db2 Big SQL configuration utility bigsql-config is located in the /usr/ibmpacks/IBM-Big_SQL/7.1.0.0/bigsql-cli/ directory after the Db2 Big SQL top-level package is installed.

You must run the bigsql-config utility on the Db2 Big SQL Head node.

A superuser (root or sudo user) or the bigsql user can run the utility. If you run the utility as the bigsql user, the utility shows only the options that do not require superuser privilege. For options that only a superuser can run, you must log on as a superuser. Run the utility with the -help option to see the options that are available for each user.
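As a minimal sketch of the notes above, the following snippet prints the current user and then asks the utility for its help text, so you can compare the options offered to a superuser with those offered to the bigsql user. The directory is the one named in the usage notes; the existence check and the messages are illustrative additions, not part of the utility.

```shell
# Installation directory from the usage notes above.
BIGSQL_CLI_DIR="/usr/ibmpacks/IBM-Big_SQL/7.1.0.0/bigsql-cli"

# The set of options that -help lists depends on who runs the utility.
echo "Running as: $(id -un)"

if [ -x "${BIGSQL_CLI_DIR}/bigsql-config" ]; then
    "${BIGSQL_CLI_DIR}/bigsql-config" -help
else
    # Illustrative guard: the top-level package might not be installed here.
    echo "bigsql-config not found in ${BIGSQL_CLI_DIR}" >&2
fi
```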

Run the bigsql-config utility before you install the Db2 Big SQL service to set installation parameter values, such as Cloudera Manager connection information.
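The following sketch shows one way to set several Cloudera Manager connection parameters before installation by calling -set once per key. The host name cm.example.com and port 7183 are placeholder assumptions for a TLS-enabled Cloudera Manager, and set_param is a hypothetical helper that prints the command instead of running it when the utility is not present on the machine.

```shell
BIGSQL_CONFIG="${BIGSQL_CONFIG:-/usr/ibmpacks/IBM-Big_SQL/7.1.0.0/bigsql-cli/bigsql-config}"

# Hypothetical helper: $1 is a single "KEY=value" pair.
set_param() {
    if [ -x "$BIGSQL_CONFIG" ]; then
        "$BIGSQL_CONFIG" -set "$1"
    else
        echo "would run: bigsql-config -set \"$1\""
    fi
}

# Placeholder values; replace them with the values for your cluster.
for kv in \
    "CM_HOST=cm.example.com" \
    "CM_PROTOCOL=https" \
    "CM_PORT=7183" \
    "CM_SSL_CA_CERTIFICATE_PATH=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-host_cert_chain.pem"
do
    # Stop at the first failure so later keys are not set on a bad base.
    set_param "$kv" || { echo "failed to set $kv" >&2; break; }
done
```

The certificate path is the one that the parameter table gives for Cloudera Manager's default auto TLS; it applies only if that configuration is in use.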

You also run the bigsql-config utility to configure security and to enable or disable the integration of Db2 Big SQL with HBase. You can set these configuration options before or after you install Db2 Big SQL. After Db2 Big SQL is installed, you can run the utility directly from the shell after you log on to the bigsql account.

The following table describes the parameters that you can configure.

Table 1. Db2 Big SQL configuration parameters
| Configuration parameter | Description | Default value |
|---|---|---|
| BIGSQL_USER | Db2 Big SQL service user name. | bigsql |
| BIGSQL_USER_ID | Db2 Big SQL service numeric uid. | 2824 |
| BIGSQL_GROUP | Db2 Big SQL service group name. | hadoop |
| HIVE_USER | Hive user name. | hive |
| HIVE_GRP | Hive group name. | hive |
| HDFS_USER | HDFS user name. Must be a valid HDFS superuser. | hdfs |
| HBASE_USER | HBase user name (if HBase is installed). | hbase |
| HADOOP_GRP | Hadoop user group name. | hadoop |
| USERS_GRP | Proxy user group. | users |
| SYSADMGRP_ID | Db2 Big SQL service group numeric gid. | 43210 |
| SUDO_SSH_USER | Sudo SSH user. | root |
| SECURITY_TYPE | User authentication type (ldap, kerberos, ldap_kerberos). The default value is os, which means operating system local users. | os |
| CUSTOM_PAM_FILE | If Pluggable Authentication Modules (PAM) is enabled, the PAM file location. This file must exist on all Db2 Big SQL nodes. | |
| DO_HIVE_WAREHOUSE_SETUP | Specifies whether HDFS file Access Control Lists (ACLs) are created to grant permissions to the Db2 Big SQL service user across the Hive warehouse. Notes: If you have a very large number of files and directories, this operation might take a long time to complete and might exceed the installation timeout. If you have Ranger installed on the cluster, a Ranger policy is preferable to HDFS ACLs. | True |
| DO_NODE_MANAGER_SETUP | Specifies whether to create the Db2 Big SQL user on NodeManager nodes. Note: YARN jobs that are run by LOAD and ACID table compaction need the Db2 Big SQL user on NodeManager nodes. Set DO_NODE_MANAGER_SETUP to True if you plan to use these statements. | False |
| ALLOW_TIME_OUT | Specifies whether to allow Db2 Big SQL worker nodes to time out during the installation process. | False |
| ALLOW_FAILURES | Specifies whether to allow Db2 Big SQL worker nodes to fail during the installation process. | False |
| REQUIRED_WORKER_PERCENTAGE | During the installation process, the minimum percentage of Db2 Big SQL workers that must successfully complete a step before continuing with the next step in the process. | 100 |
| TIME_OUT_LIMIT | The maximum timeout limit, in minutes, after the last Db2 Big SQL worker completes the current step in the installation process. | 5 |
| VAR_DIR | Directory where Db2 Big SQL puts files that are large or can potentially grow, such as log files. | /var/ibm/bigsql |
| DBPATH | Location of the Db2 Big SQL metastore. If you are using GPFS, keep the database path local across all Db2 Big SQL nodes. | /var/ibm/bigsql/database |
| DATA_DIRECTORIES | Location of the Db2 Big SQL data directories. Best practice is to spread the directories across all of the same physical disks that are assigned to store Hadoop data. To specify multiple paths on multiple physical disks, separate each path with a comma. If the use of local Db2 tables is anticipated, best practice is to ensure that the Db2 Big SQL head node also has physical disks mounted at the paths that are specified here. You can have the directories on GPFS with symlinking. | /hadoop/bigsql |
| BIGSQL_PORT | Db2 Fast Communication Manager (FCM) port number. | 28051 |
| DB2_PORT | Db2 connection port number. | 32051 |
| DB2_SSL_PORT | Port that Db2 uses for SSL communication. | 32052 |
| MLN_COUNT | Number of Multiple Logical Nodes (MLNs). Before or after you install Db2 Big SQL, you can specify a value greater than one to increase the number of Db2 Big SQL worker node partitions. Note: To change the number of worker node partitions, you must use the Db2 Big SQL cluster administration utility. | 1 |
| BIGSQL_HA_PORT | Port number used by Db2 Big SQL High Availability (HA). | 20008 |
| BIGSQL_HA_DB_BACKUP_DIR | Db2 Big SQL database backup directory when you are adding a second Db2 Big SQL Head. | /var/ibm/bigsql |
| CM_ADMIN_USER | Cloudera Manager admin user name. Note: If you use a non-default Cloudera Manager user, the user must have Cluster Administrator or Full Administrator privileges. | admin |
| CM_HOST | Cloudera Manager server host name. | |
| CM_PROTOCOL | Protocol (http or https) to access Cloudera Manager. | http |
| CM_PORT | Cloudera Manager port number. | 7180 |
| CM_SSL_CA_CERTIFICATE_PATH | When SSL is enabled for Cloudera Manager, the location of the truststore file. Note: If Cloudera Manager is configured with its default auto TLS, set the value to /var/lib/cloudera-scm-agent/agent-cert/cm-auto-host_cert_chain.pem. | |
| CM_BASE_CLUSTER | When there is more than one cluster defined in Cloudera Manager, the name of the cluster where the HDFS, HIVE, Ranger, HBase, and Zookeeper services are installed. This parameter is not required if there is only one cluster defined in Cloudera Manager. | |
| CM_COMPUTE_CLUSTER | When there is more than one cluster defined in Cloudera Manager, the name of the cluster on which to install Db2 Big SQL. If a value is not specified for this parameter, Db2 Big SQL is installed on the cluster that is specified in the CM_BASE_CLUSTER parameter. This parameter is not required if there is only one cluster defined in Cloudera Manager. | |
| BIGSQL_MEM_PERCENT | Percentage of system memory that is assigned to Db2 Big SQL. | 25 |
| BIGSQL_SCHEDULER_ADMIN_PORT | Db2 Big SQL scheduler admin port number. | 7053 |
| BIGSQL_SCHEDULER_SERVICE_PORT | Db2 Big SQL scheduler service port number. | 7054 |
| BIGSQL_IMPERSONATION | Enables support for impersonation. | False |
| BIGSQL_PUBLIC_TABLE_ACCESS | When impersonation is enabled, specifies whether SELECT and IUD (insert, update, delete) privileges are granted to PUBLIC when a CREATE HADOOP TABLE statement is run. | False |
| HBASE_ENABLED | Enable support for HBase. | False |
| RANGER_ENABLED | Enable support for Ranger. | False |
| ATLAS_ENABLED | Enable support for Atlas. | False |
| FC_ENABLED | | False |
| LDAP_HOST | LDAP server host name. | |
| LDAP_PORT | LDAP server port number. | |
| LDAP_BASE_DN | LDAP base DN. | |
| LDAP_BIND_DN | LDAP bind DN. | |
| LDAP_TLS_ENABLED | Enable TLS on the LDAP server. | |
| LDAP_ROOT_CA | LDAP root CA. | |
Note: When a function is available, use the function instead of the corresponding key value. For example, to enable Ranger, use the command bigsql-config -enableRanger. To enable HBase, use the command bigsql-config -enableHBase. To enable impersonation, use the command bigsql-config -enableImpersonation.
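To illustrate the note, the sketch below maps a configuration key to the dedicated option that should be used in its place. option_for_key is a hypothetical helper, and pairing each option with a key from the table (for example, -enableRanger with RANGER_ENABLED) is inferred from the parameter descriptions rather than stated by the utility.

```shell
# Hypothetical helper: print the dedicated option for a key, or return 1
# when the note above names no dedicated option for it.
option_for_key() {
    case "$1" in
        RANGER_ENABLED)       echo "-enableRanger" ;;
        HBASE_ENABLED)        echo "-enableHBase" ;;
        BIGSQL_IMPERSONATION) echo "-enableImpersonation" ;;
        *)                    return 1 ;;
    esac
}

key="RANGER_ENABLED"
if opt=$(option_for_key "$key"); then
    # A function exists, so prefer it over setting the key directly.
    echo "use: bigsql-config $opt"
else
    echo "use: bigsql-config -set \"$key=<value>\""
fi
```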

The log file for the utility is bigsql-config.log, and it is located in the /tmp directory.
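When a command fails, the most recent log entries are usually the first place to look. A minimal sketch, assuming only the log location given above; the 20-line window is an arbitrary choice:

```shell
LOG_FILE="/tmp/bigsql-config.log"

if [ -r "$LOG_FILE" ]; then
    # Show the most recent entries.
    tail -n 20 "$LOG_FILE"
else
    echo "No readable log at $LOG_FILE (the utility might not have run yet)" >&2
fi
```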

Syntax

bigsql-config [options]

The following [options] can be included when the utility is run by a superuser:

-enableHBase
Enable support for HBase.
-disableHBase
Disable support for HBase.
-enablePam
Enable support for Pluggable Authentication Module (PAM).
-disablePam
Disable support for PAM.
-enableRanger
Enable support for Ranger.
-disableRanger
Disable support for Ranger.

The following [options] can be included when the utility is run by the bigsql user:

-display [all]
Display configurable parameters or all parameters.
-get key
Display the current value of a configuration key.
-set "key=value" [force]
Update the configuration key to the value.
-reset
Reset to default values. You can use this option only before you install Db2 Big SQL. After the Db2 Big SQL installation is complete, you can't reset to default values.
-enableAtlas
Enable support for Atlas.
-disableAtlas
Disable support for Atlas.
-enableImpersonation
Enable support for impersonation.
-disableImpersonation
Disable support for impersonation.
-help
Display the help.
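The options above can be combined into a simple read, update, and verify sequence when logged on as the bigsql user. This is a sketch only: run_cfg is a hypothetical wrapper that prints the command when the utility is not installed on the machine, the value 30 is an arbitrary illustration, and the output format of -display and -get is not shown because it is not described here.

```shell
BIGSQL_CONFIG="${BIGSQL_CONFIG:-/usr/ibmpacks/IBM-Big_SQL/7.1.0.0/bigsql-cli/bigsql-config}"

# Hypothetical wrapper: run the utility if present, otherwise print the command.
run_cfg() {
    if [ -x "$BIGSQL_CONFIG" ]; then
        "$BIGSQL_CONFIG" "$@"
    else
        echo "would run: bigsql-config $*"
    fi
}

run_cfg -display                       # list the configurable parameters
run_cfg -get BIGSQL_MEM_PERCENT        # read the current value
run_cfg -set "BIGSQL_MEM_PERCENT=30"   # update it (illustrative value)
run_cfg -get BIGSQL_MEM_PERCENT        # confirm the change
```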

Examples

  1. Specify the FQDN of the Cloudera Manager host.
    ./bigsql-config -set "CM_HOST=<server_name>.com"
  2. Set multiple disks for the data by specifying a comma-separated list of mounted database directories:
    ./bigsql-config -set "DATA_DIRECTORIES=/disk_1/hadoop/bigsql,/disk_2/hadoop/bigsql,/disk_3/hadoop/bigsql"
  3. Enable support for Ranger:
    ./bigsql-config -enableRanger
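
For example 2, the comma-separated value can be assembled from a list of mount points instead of being typed by hand, which helps avoid a stray space inside the list. The sketch below uses a hypothetical join_dirs helper; rejecting paths that contain a space or a comma is an assumption made here to catch typos, not a documented rule of the utility. The snippet prints the resulting command instead of running it.

```shell
# Hypothetical helper: join all arguments with commas, with no spaces.
join_dirs() {
    for d in "$@"; do
        case "$d" in
            # Assumed guard: a space or comma inside a path is treated as a typo.
            *" "*|*,*) echo "invalid path: '$d'" >&2; return 1 ;;
        esac
    done
    # In a subshell, "$*" joins the arguments with the first character of IFS.
    ( IFS=,; printf '%s\n' "$*" )
}

DATA_DIRECTORIES=$(join_dirs /disk_1/hadoop/bigsql /disk_2/hadoop/bigsql /disk_3/hadoop/bigsql)
echo "./bigsql-config -set \"DATA_DIRECTORIES=${DATA_DIRECTORIES}\""
```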