Installing the service on Spectrum Conductor clusters
Before a project administrator can install Execution Engine for Apache Hadoop on the Spectrum Conductor cluster, the service must first be installed on Cloud Pak for Data. Before you install the service, review the following requirements and confirm that you are aware of the supported Spectrum Conductor versions and platforms:
- System requirements
- Installation steps for non-root users
- Installing the service
- Configuring custom certificates
System requirements for installing Execution Engine for Apache Hadoop
- Spectrum Conductor
- Version 2.5.0 or higher
- Platform: x86 with RHEL 7.x
- Anaconda instances set up with Miniconda version 4.6.16
- Edge node hardware requirements
- 8 GB memory
- 2 CPU cores
- 100 GB disk, mounted and available on /var in the local Linux file system. The installation creates the following directories and these locations are not configurable:
- To store the logs:
- To store the process IDs:
- To store the logs:
- A 10 GB network interface card is recommended for multi-tenant environments.
- Edge node software requirements
- Python 2.7 or higher
- Java JRE 1.8x
- An external port for the gateway service.
- An internal port for the Hadoop Integration service.
- Internal ports for the Jupyter Enterprise Gateway service.
- Service user requirements
The Execution Engine for Apache Hadoop service runs as a service user. This user must be a valid Linux user on the node where the Execution Engine for Apache Hadoop service is installed.
If you install the service on multiple edge nodes for high availability, use the same service user on every node.
This user should have the Cluster administrator role or the Consumer administrator role with the context of the root Consumer (/) to be able to create Spectrum Conductor Anaconda environments.
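The hardware and software requirements above can be checked before installation with a short preflight script. The following is a minimal sketch, not part of the product: the function names are hypothetical, it treats 1 GB as 10^9 bytes, it interprets "available on /var" as free space, and it only checks that a `java` executable is on the PATH rather than verifying the JRE version.

```python
import os
import shutil
import sys

GB = 10**9  # assumption: the requirements are stated in decimal gigabytes


def check_edge_node(mem_bytes, cpu_cores, var_free_bytes, python_version, java_path):
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    if mem_bytes < 8 * GB:
        problems.append("less than 8 GB memory")
    if cpu_cores < 2:
        problems.append("fewer than 2 CPU cores")
    if var_free_bytes < 100 * GB:
        problems.append("less than 100 GB available on /var")
    if tuple(python_version[:2]) < (2, 7):
        problems.append("Python older than 2.7")
    if java_path is None:
        problems.append("no java executable on PATH")
    return problems


def gather_and_check():
    """Collect the values from the local Linux node and run the checks."""
    mem = os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    cores = os.cpu_count() or 0
    var_free = shutil.disk_usage("/var").free
    return check_edge_node(mem, cores, var_free, sys.version_info, shutil.which("java"))


if __name__ == "__main__":
    for problem in gather_and_check():
        print("WARNING:", problem)
```

Keeping the thresholds in `check_edge_node` separate from the code that reads the system makes the rules easy to adjust if your environment measures memory or disk differently.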
Spectrum Conductor cluster setup
The Execution Engine for Apache Hadoop service adds environments to existing Anaconda distribution instances that were defined in Spectrum Conductor. As part of the setup, there must be an Anaconda distribution instance using Conda version 4.6.14, or if you plan to use custom images, you’ll need to provide an Anaconda distribution instance that has the same Conda version as the custom image.
You’ll also need to provide the uuid that is associated with the Anaconda distribution instance in the dsxhi_install.conf file as part of the installation configuration.
Installation steps for non-root users
If you plan to install the Execution Engine for Apache Hadoop service as a non-root user, the following permissions should be granted using the visudo command:
Steps for DSXHI non-root installation:
- Apply visudo rules for non-root user
- sudo yum install
- sudo chown <non-root_user:non-root_user> -R /opt/ibm/dsxhi/
- sudo python
## DSXHI
<non-root_user> ALL=(root) NOPASSWD: /usr/bin/yum install <path-to-rpm/rpm>, /usr/bin/yum erase dsxhi*, /usr/bin/chown * /opt/ibm/dsxhi/, /usr/bin/python /opt/ibm/dsxhi/*
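To avoid typing mistakes when filling in the placeholders, the rule above can be rendered from its two variable parts. This is an illustrative sketch only: `render_sudoers_rule` is a hypothetical helper, and the user name and RPM path in the usage example are made-up values. It only produces the text; the rule should still be added through visudo so that its syntax is validated.

```python
def render_sudoers_rule(user, rpm_path):
    """Build the NOPASSWD rule for the given non-root user and RPM path."""
    commands = [
        "/usr/bin/yum install {}".format(rpm_path),
        "/usr/bin/yum erase dsxhi*",
        "/usr/bin/chown * /opt/ibm/dsxhi/",
        "/usr/bin/python /opt/ibm/dsxhi/*",
    ]
    return "{} ALL=(root) NOPASSWD: {}".format(user, ", ".join(commands))


if __name__ == "__main__":
    # Hypothetical values; substitute your own user and RPM location.
    print("## DSXHI")
    print(render_sudoers_rule("dsxhi_svc", "/tmp/dsxhi.rpm"))
```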
Watson Studio interacts with a Spectrum Conductor cluster through the following services:
Watson Studio user
Every user that is connecting from Watson Studio must be a valid user on the Spectrum Conductor cluster. The recommended way to achieve this is by integrating Watson Studio and the Spectrum Conductor cluster with the same LDAP.
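One way to confirm that a user name resolves on a cluster node is to look it up through the operating system's account database. The sketch below uses Python's standard `pwd` module, which queries the system account database, so LDAP users are found when the node's name service is configured for LDAP. The function name is hypothetical, and a successful lookup only shows that the account exists on that node, not that it has any Spectrum Conductor role.

```python
import pwd


def is_valid_local_user(username):
    """Return True if the name resolves to an account on this node."""
    try:
        pwd.getpwnam(username)
        return True
    except KeyError:
        # getpwnam raises KeyError when no matching entry is found.
        return False


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        status = "found" if is_valid_local_user(name) else "NOT found"
        print("{}: {}".format(name, status))
```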
Installing the service
- Run the RPM installer. The rpm is installed in
- If you’re running the install as the service user, run sudo chown <serviceuser> -R /opt/ibm/dsxhi.
- Create a dsxhi_install.conf file, using the /opt/ibm/dsxhi/conf/dsxhi_install.conf.template.SPECTRUM file as a reference.
- Fill in the dsxhi_install.conf file based on your Spectrum Conductor configuration. The template describes what is needed for each field. If you need to use your own custom certificates, see Configuring custom certificates.
- Optional: If you need to control the location of Java, use a shared truststore, or pass additional Java options, update the /opt/ibm/dsxhi/conf/dsxhi_env.sh script to include the appropriate values for the environment variables:
- In /opt/ibm/dsxhi/bin, run the ./install.py script to install the service. The script prompts for inputs on the following options (alternatively, you can specify the options as flags):
- Accept the license terms (Hadoop registration uses the same license as Watson Studio). You can also accept the license through the
- You are prompted for the password for the Spectrum Conductor REST endpoints. The value can also be passed through the
- You are prompted for the master secret for the gateway service. The value can also be passed through the
- You are prompted for the Java cacerts truststore password. The value can be passed through
- Optional: If the dsxhi_install.conf is used, provide the password associated with this file. The value can be passed through
After the service is installed, the necessary components, such as the gateway service, DSXHI integration services, and Jupyter Enterprise Gateway services, are started.
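If you want to launch the installer from a wrapper script, the run step can be sketched as follows. This is a minimal illustration that starts ./install.py interactively from /opt/ibm/dsxhi/bin with no flags, so the script still prompts for every input. The helper names are hypothetical, and no installer options are shown because they should be taken from the installer's own help output.

```python
import os
import subprocess

INSTALL_DIR = "/opt/ibm/dsxhi/bin"


def build_install_invocation():
    """Describe how the installer is started: the command and its working directory."""
    return {"args": ["./install.py"], "cwd": INSTALL_DIR}


def run_installer():
    """Run the installer interactively and return its exit code."""
    invocation = build_install_invocation()
    if not os.path.isdir(invocation["cwd"]):
        raise FileNotFoundError(
            invocation["cwd"] + " not found; run the RPM installer first"
        )
    # stdin and stdout are inherited, so the installer's prompts reach the terminal.
    return subprocess.call(invocation["args"], cwd=invocation["cwd"])


if __name__ == "__main__":
    raise SystemExit(run_installer())
```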
Configuring custom certificates
You can use your existing certificates without modifying the system truststore. The following configuration properties let DSXHI make these customizations:
- DSXHI typically generates a keystore, converts it to a .crt, and adds the .crt to the Java truststore. However, with this configuration, DSXHI allows you to provide a custom keystore that can be used to generate the required
- DSXHI previously detected the appropriate truststore to use as part of the installation. With the dsxhi_cacert property, DSXHI allows you to provide any custom truststore (CACERTS), where DSXHI certs are added.
- This configuration lets you choose whether you or DSXHI adds the host certificate to the truststore. If you set the configuration to False, users must add the host certificate to the truststore themselves, and DSXHI does not make any changes to the truststore. If you set the configuration to True, DSXHI retains its default behavior and adds the host certificate to the Java truststore and on detected datanodes for gateway and web services.
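When you add the host certificate to a truststore yourself, the standard JDK keytool utility can do the import. The sketch below only assembles the keytool command. The function name, alias, file paths, and password in the usage example are hypothetical, and it is an assumption that a plain keytool import is sufficient for your environment. Note that a password passed on the command line may be visible to other users on the host.

```python
import subprocess


def build_keytool_import(truststore, storepass, cert_file, alias):
    """Return the keytool command that imports cert_file into truststore."""
    return [
        "keytool", "-importcert", "-noprompt",
        "-alias", alias,
        "-file", cert_file,
        "-keystore", truststore,
        "-storepass", storepass,
    ]


if __name__ == "__main__":
    # Hypothetical values; replace them with your truststore and certificate.
    command = build_keytool_import(
        truststore="/path/to/custom/cacerts",
        storepass="changeit",
        cert_file="/path/to/host.crt",
        alias="dsxhi-host",
    )
    subprocess.run(command, check=True)
```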
See Uninstalling the service on a Spectrum Conductor cluster for information on uninstalling the service.