Template parameters for installing the service on Apache Hadoop clusters

When you install Execution Engine for Apache Hadoop, you must create a /opt/ibm/dsxhi/conf/dsxhi_install.conf file. Use the parameters from the following template files as a reference when you create the file:

dsxhi_install.conf.template.CDH

Property Examples
Mandatory or Optional
dsxhi_license_acceptance= dsxhi_license_acceptance=A
Optional
Specify 'A' or 'a' for accepting the license.
Specify 'R' or 'r' for rejecting the license.
If the property is empty, the user is prompted during installation.
dsxhi_serviceuser=
dsxhi_serviceuser_group=
Mandatory
Specify the username and group of the user (dsxhi service user) running the dsxhi service.
dsxhi_serviceuser_keytab=
dsxhi_spnego_keytab=
If the CDH cluster is kerberized, it is mandatory to specify the complete path to the keytab for the dsxhi service user and the spnego keytab.
If the CDH cluster is not kerberized, these properties should be left blank.
dsxhi_gateway_port= dsxhi_gateway_port=8443
Mandatory
Specify the port number for the dsxhi gateway service. This port should be accessible externally.
dsxhi_rest_port= dsxhi_rest_port=8082
Mandatory
Specify the port number for the dsxhi rest service.
cluster_manager_url=
cluster_admin=
cluster_manager_url=http://cdhcluster1:7180
cluster_admin=admin
Mandatory
Specify the Cloudera Manager URL and admin username for the CDH cluster. The user is prompted for the password during installation. If the URL is not specified, some pre-checks are not performed before the dsxhi installation.
cluster_name= Optional
Specify the name of the CDH cluster if multiple clusters are managed by a single Cloudera Manager.
exposed_hadoop_services= exposed_hadoop_services=webhdfs,livyspark,jeg
Optional
Specify the Hadoop services that dsxhi service should expose.
existing_livyspark_url= existing_livyspark_url=http://cdhcluster:8999
Optional
If the CDH cluster has Livy for Spark configured, specify the URL.
dsxhi_livyspark_port= dsxhi_livyspark_port=8999
Mandatory
Specify the port number that dsxhi service should use when installing and configuring Livy for Spark.
dsxhi_jeg_port= dsxhi_jeg_port=8888
Mandatory
Specify the port number for the dsxhi JEG service.
known_dsx_list= known_dsx_list=https://dsxlcluster1.ibm.com,https://dsxlcluster2.ibm.com:31843
Optional
Specify the list of URLs for the dsx local clusters that will register this dsxhi service. Each URL should include the port number if necessary.
package_installer_tool= Optional
Specify the installer tool to use for installing packages. Supported tools are yum, rpm, and dnf. This option should be set to use the install_package script.
systemctl_enable= systemctl_enable=False
Optional
Specify True or true to run systemctl enable for the dsxhi service.
packages= packages=lapack
Optional
Specify the packages to install. You can provide multiple packages on the same line, separated by commas, such as package1,package2. This option should be set to use the install_package script.
cluster_nodes= Optional
Specify the Hadoop cluster hosts that you want to install packages on. You can provide multiple hosts on the same line, separated by commas, such as host1,host2,host3. If this option is set, packages are installed on the specified hosts. Otherwise, packages are installed on all node manager hosts of the Hadoop cluster.
cluster_ssh_user= Optional
Specify the user who can ssh to the Hadoop cluster nodes by using the ssh key and install packages. This option should be set to use the install_package script.
cluster_ssh_key_path= Optional
Specify the path to the ssh private key or certificate that is used to ssh to the Hadoop cluster nodes to install packages, such as /root/.ssh/id_rsa. This option should be set to use the install_package script.
hive_jdbc_client_url= hive_jdbc_client_url=jdbc:hive2://remotehost:port
Optional
Provide the client-side Hive JDBC URL.
custom_jks= Optional
The custom jks is provided by the user. It's used for gateway, JEG, and web services. If it's provided, dsxhi generates the required .crt file to add to the Java truststore. If it's not provided, dsxhi generates a .jks and then generates a .crt from it to add to the Java truststore.
dsxhi_cacert= Optional
Custom CACERT is provided by a user. If it's not provided, dsxhi attempts to detect the default CACERT on the system.
add_certs_to_truststore= add_certs_to_truststore=True
Optional
Option to allow dsxhi to add certs to the truststore.
True (default): Allow dsxhi to add a host certificate to the Java truststore on detected data nodes for gateway and web services.
False: The user is expected to have added a host certificate to the Java truststore with a default host alias on the generated .crt file.
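
The table above can be turned into a quick pre-installation check. The following Python sketch is not part of the dsxhi tooling; it is a minimal illustration that reads key=value text and reports which properties marked Mandatory in the CDH table are empty. The function names, the handling of `#` comment lines, and the sample service user name `dsxhi` are assumptions made for this example.

```python
# Properties marked "Mandatory" in the CDH table above.
CDH_MANDATORY = [
    "dsxhi_serviceuser",
    "dsxhi_serviceuser_group",
    "dsxhi_gateway_port",
    "dsxhi_rest_port",
    "cluster_manager_url",
    "cluster_admin",
    "dsxhi_livyspark_port",
    "dsxhi_jeg_port",
]
# Required only when the CDH cluster is kerberized.
KEYTAB_PROPERTIES = ["dsxhi_serviceuser_keytab", "dsxhi_spnego_keytab"]


def parse_conf(text):
    """Parse key=value lines into a dict.

    Blank lines and lines starting with '#' are skipped; treating '#'
    as a comment marker is an assumption about the file format.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_cdh_properties(props, kerberized=False):
    """Return the mandatory property names that are absent or empty."""
    required = list(CDH_MANDATORY)
    if kerberized:
        required += KEYTAB_PROPERTIES
    return [name for name in required if not props.get(name)]


# Sample values come from the Examples column; the service user and
# group name "dsxhi" are hypothetical placeholders.
sample = """\
dsxhi_license_acceptance=A
dsxhi_serviceuser=dsxhi
dsxhi_serviceuser_group=dsxhi
dsxhi_serviceuser_keytab=
dsxhi_spnego_keytab=
dsxhi_gateway_port=8443
dsxhi_rest_port=8082
cluster_manager_url=http://cdhcluster1:7180
cluster_admin=admin
dsxhi_livyspark_port=8999
dsxhi_jeg_port=8888
"""

props = parse_conf(sample)
print(missing_cdh_properties(props))                   # non-kerberized cluster
print(missing_cdh_properties(props, kerberized=True))  # kerberized cluster
```

For the sample text, the first call reports nothing missing, while the kerberized check flags the two empty keytab properties. Optional properties are deliberately not checked.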

dsxhi_install.conf.template.SPECTRUM

Property Examples
Mandatory or Optional
dsxhi_license_acceptance= dsxhi_license_acceptance=A
Optional
Specify 'A' or 'a' for accepting the license.
Specify 'R' or 'r' for rejecting the license. If the property is empty, the user is prompted during the installation.
dsxhi_serviceuser=
dsxhi_serviceuser_group=
Mandatory
Specify the username and group of the user (dsxhi service user) running the dsxhi service.
cluster_manager_url=
cluster_admin=
cluster_manager_url=https://cluster:8643/platform/rest/conductor/v1
cluster_admin=Admin
Mandatory
The cluster_manager_url parameter is set to the conductor REST endpoint. The endpoint allows remote systems to connect to conductor for administration purposes. The admin information is used only at install time or to create a base JEG environment.
conductor_secure_curl_options= conductor_secure_curl_options=--cacert /opt/ibm/spectrumcomputingMain/wlp/usr/shared/resources/security/cacert.pem
Mandatory
Specify the curl options that are needed to connect to the conductor endpoint.
conductor_endpoint_ha_list= conductor_endpoint_ha_list=host1,host2
Mandatory
Specify a comma-separated list of master candidate fail-over hosts that are used if the currently defined cluster manager URL is not available.
conductor_anaconda_inst_uuid= conductor_anaconda_inst_uuid=22aba77e-dd61-4cb7-a0fa-48af603125ae
Mandatory
Specify a comma-separated list for the Anaconda instance in which the Watson Studio environments are created.
dsxhi_gateway_port= dsxhi_gateway_port=8443
Mandatory
Specify the port number for the dsxhi gateway service. This port should be accessible externally.
dsxhi_rest_port= dsxhi_rest_port=8082
Mandatory
Specify the port number for the dsxhi rest service.
exposed_hadoop_services= exposed_hadoop_services=jeg
Spectrum only supports JEG.
dsxhi_jeg_port= dsxhi_jeg_port=8888
Mandatory
Specify the port number for the dsxhi JEG service.
known_dsx_list= known_dsx_list=https://dsxlcluster1.ibm.com,https://dsxlcluster2.ibm.com:31843
Optional
Specify the list of URLs for the dsx local clusters that will register this dsxhi service. Each URL should include the port number if necessary.
systemctl_enable= systemctl_enable=False
Optional
Specify True or true to run systemctl enable for the dsxhi service.
custom_jks= Optional
The custom jks is provided by the user. It's used for gateway, JEG, and web services. If it's provided, dsxhi generates the required .crt file to add to the Java truststore. If it's not provided, dsxhi generates a .jks and then generates a .crt from it to add to the Java truststore.
dsxhi_cacert= Optional
The custom CACERT is provided by the user. If it's not provided, dsxhi attempts to detect the default CACERT on the system.
add_certs_to_truststore= add_certs_to_truststore=True
Optional
Option to allow dsxhi to add certs to the truststore.
True (default): Allow dsxhi to add a host certificate to the Java truststore on detected data nodes for gateway and web services.
False: The user is expected to have added a host certificate to the Java truststore with a default host alias on the generated .crt file.
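
As an illustration of how the Spectrum properties fit together, the following Python sketch builds the text of a key=value file from a dictionary and refuses settings that conflict with the table above. It is a hedged example, not an IBM-provided tool: the function name `render_spectrum_conf`, the error messages, and the service user name `dsxhi` are hypothetical, and the remaining values are copied from the Examples column.

```python
# Properties marked "Mandatory" in the Spectrum table above.
SPECTRUM_MANDATORY = [
    "dsxhi_serviceuser",
    "dsxhi_serviceuser_group",
    "cluster_manager_url",
    "cluster_admin",
    "conductor_secure_curl_options",
    "conductor_endpoint_ha_list",
    "conductor_anaconda_inst_uuid",
    "dsxhi_gateway_port",
    "dsxhi_rest_port",
    "dsxhi_jeg_port",
]


def render_spectrum_conf(values):
    """Return key=value text, or raise ValueError for conflicting settings."""
    missing = [name for name in SPECTRUM_MANDATORY if not values.get(name)]
    if missing:
        raise ValueError("missing mandatory properties: " + ", ".join(missing))
    # The table states that Spectrum only supports JEG.
    services = values.get("exposed_hadoop_services", "jeg")
    if services != "jeg":
        raise ValueError("Spectrum only supports jeg, got: " + services)
    merged = dict(values, exposed_hadoop_services=services)
    return "".join(f"{key}={value}\n" for key, value in merged.items())


# Example values from the table; "dsxhi" as user and group is a
# hypothetical placeholder.
settings = {
    "dsxhi_serviceuser": "dsxhi",
    "dsxhi_serviceuser_group": "dsxhi",
    "cluster_manager_url": "https://cluster:8643/platform/rest/conductor/v1",
    "cluster_admin": "Admin",
    "conductor_secure_curl_options": "--cacert /opt/ibm/spectrumcomputingMain/wlp/usr/shared/resources/security/cacert.pem",
    "conductor_endpoint_ha_list": "host1,host2",
    "conductor_anaconda_inst_uuid": "22aba77e-dd61-4cb7-a0fa-48af603125ae",
    "dsxhi_gateway_port": "8443",
    "dsxhi_rest_port": "8082",
    "dsxhi_jeg_port": "8888",
}

conf_text = render_spectrum_conf(settings)
print(conf_text)
```

The resulting text could then be written to /opt/ibm/dsxhi/conf/dsxhi_install.conf. Checking the mandatory properties before the file is written keeps incomplete settings from reaching the installer.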