Administering Spectrum Conductor clusters

You can manage and monitor the tools that are available with the Execution Engine for Apache Hadoop service.

Managing services

Periodically, the Spectrum Conductor admin must perform the following tasks for managing the Execution Engine for Apache Hadoop service:

Check the status of the services
In /opt/ibm/dsxhi/bin, run ./status.py to check the status of services.
Start the services
In /opt/ibm/dsxhi/bin, run ./start.py to start the services.
Stop the services
In /opt/ibm/dsxhi/bin, run ./stop.py to stop the services.
Check the default Jupyter Enterprise Gateway (JEG) environment
In /opt/ibm/dsxhi/bin, run ./spectrum_check_jeg_base.py. This checks whether a valid default JEG environment called cpd_shared_jeg_base exists in the configured Anaconda instance.

Note: If this environment is not available, users can run ./spectrum_check_jeg_base.py -u to re-create the cpd_shared_jeg_base environment in the configured Anaconda instance.
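
For example, a check-and-repair sequence from a terminal (a sketch that simply combines the commands above):

    cd /opt/ibm/dsxhi/bin
    ./spectrum_check_jeg_base.py       # check for the default cpd_shared_jeg_base environment
    ./spectrum_check_jeg_base.py -u    # re-create it if the check reports it is missing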

Logs

  • The component logs are stored in /var/log/dsxhi.
  • The component PIDs are stored in /var/log/dsxhi, /var/log/livy2, and /opt/ibm/dsxhi/gateway/logs/.

The log level for the gateway service can be set by editing /opt/ibm/dsxhi/gateway/conf/gateway-log4j.properties and setting the appropriate level in the log4j.logger.org.apache.knox.gateway property.
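
For example, to raise the gateway logging to DEBUG, set the property named above (a sketch; only this line changes, the rest of gateway-log4j.properties stays as shipped):

    log4j.logger.org.apache.knox.gateway=DEBUG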

Managing and configuring access for Watson Studio

To maintain control over the access to an Execution Engine for Apache Hadoop service, the Spectrum Conductor admin should maintain a list of known Watson Studio clusters that can access the service.

A Watson Studio cluster is recognized by its URL, which should be passed in when adding to, refreshing, or deleting from the list of known clusters. A comma-separated list of Watson Studio clusters can be passed for each of the operations. Regardless of the order in which the add and delete arguments are specified, the deletes are applied first and then the adds.
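
For example, a combined update might look like the following (a sketch; the ordering rule above implies both flags can be given in a single invocation, and the URLs are placeholders):

    cd /opt/ibm/dsxhi/bin
    ./manage_known_dsx.py --delete "https://old-wsl.example.com" --add "https://new-wsl.example.com"
    # Deletes are applied first, then adds, regardless of argument order.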

Once a cluster is added, removed, or refreshed, the Spectrum Conductor admin must update the ascd configuration so that CPD_JWT_SECRET_KEY references the specific Watson Studio cluster by its topology mapping.

Working with the Watson Studio clusters

To add Watson Studio clusters to the list, do one of the following tasks:

  • In /opt/ibm/dsxhi/bin, run ./manage_known_dsx.py --add "url1,url2...urlN". Once a Watson Studio cluster is added to the list, the necessary authentication is set up and a secure URL is generated for the Watson Studio cluster.

  • Or, if you want to specify your own topology name, you can add a -t flag, such as ./manage_known_dsx.py --add url1 -t mywsl. This works with only one Watson Studio cluster at a time.

To refresh Watson Studio clusters in the list:

  • If the Watson Studio cluster was re-installed, you can refresh its information. In /opt/ibm/dsxhi/bin, run ./manage_known_dsx.py --refresh "url1,url2...urlN".

To delete Watson Studio clusters from the list:

  • In /opt/ibm/dsxhi/bin, run ./manage_known_dsx.py --delete "url1,url2...urlN".

To view the list of Watson Studio clusters:

  • In /opt/ibm/dsxhi/bin, run ./manage_known_dsx.py --list to view the Watson Studio clusters and their associated URLs.

Spectrum Conductor configuration for ascd

Once you’ve updated the known list using ./manage_known_dsx.py, you’ll need to configure the Spectrum Conductor ascd with the public keys for the Watson Studio topologies that will connect to Spectrum Conductor.

First, determine the “topology” value for the Watson Studio cluster using ./manage_known_dsx.py --list. The output shows two columns: DSX Local Cluster URL and DSXHI Service URL.

DSX Local Cluster URL                           DSXHI Service URL
https://wsl35-cpd-wsl35.apps.mysystem.ibm.com   https://myspectrum.ibm.com:8444/gateway/wsl35-cpd-wsl35
https://wsl35v2-cpd-wsl35v2.apps.fast.ibm.com   https://myspectrum.ibm.com:8444/gateway/wsl35v2-cpd-wsl35v2

Take the URL in the DSXHI Service URL column and extract the last segment of the URL path. That string is the topology. For example, the topology is wsl35-cpd-wsl35 for the URL https://myspectrum.ibm.com:8444/gateway/wsl35-cpd-wsl35.
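
If you prefer to extract the topology programmatically, a minimal shell sketch (the URL is the example value above):

    # Strip everything up to and including the last "/" to get the topology
    DSXHI_URL="https://myspectrum.ibm.com:8444/gateway/wsl35-cpd-wsl35"
    TOPOLOGY="${DSXHI_URL##*/}"
    echo "${TOPOLOGY}"    # prints: wsl35-cpd-wsl35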

In Spectrum Conductor, find the ascd.conf file. The file should be located at: $EGO_CONFDIR/../../ascd/conf/ascd.conf.

In ascd.conf, configure the new parameter CPD_JWT_SECRET_KEY. The value is in the format: <Topology Name>;<Public Key Location>,<Topology Name>;<Public Key Location>,<REPEAT>.

To get the public key for a topology, run the following from a terminal:

export JWT_PUBLIC_KEY="/opt/ibm/jwtCert_wsl35v2"
export WS="wsl35v2-cpd-wsl35v2.apps.fast.ibm.com"
curl -k https://${WS}/auth/jwtpublic > ${JWT_PUBLIC_KEY}
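
You can then spot-check the downloaded file (a sketch; it assumes the endpoint returns a PEM-encoded public key, so the first line should resemble a PEM header):

    head -1 ${JWT_PUBLIC_KEY}    # expect a line like: -----BEGIN PUBLIC KEY-----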

An example of an update in ascd.conf is: CPD_JWT_SECRET_KEY=wsl35v2-cpd-wsl35v2;/opt/ibm/jwtCert_wsl35v2.
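
If more than one Watson Studio cluster is registered, the entries repeat with a comma separator, following the format above (a sketch; /opt/ibm/jwtCert_wsl35 is an assumed path for the first cluster's key, and the topology names come from the table above):

    CPD_JWT_SECRET_KEY=wsl35-cpd-wsl35;/opt/ibm/jwtCert_wsl35,wsl35v2-cpd-wsl35v2;/opt/ibm/jwtCert_wsl35v2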

Once you’ve updated the ascd.conf file, restart ascd for the new configuration to take effect.
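
How ascd is restarted depends on your environment; one common approach is through the EGO service controller (a sketch, assuming egosh is on your PATH, you are logged on with administrative credentials, and the service is registered under the name ascd; confirm the service name in your cluster):

    egosh service stop ascd
    egosh service start ascd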

Watson Studio admin task

After the Hadoop administrator adds a Watson Studio cluster to the known list maintained by the Execution Engine for Apache Hadoop service and sets up the Spectrum Conductor ascd configuration, the Watson Studio admin can register the service on the Watson Studio cluster using the secure URL and the service user. Learn how to register Hadoop clusters.

Spectrum Conductor instance group configuration

The Spectrum admin must add a single-file package to the instance groups that are to be used by users who plan to launch their Jupyter Notebooks from Watson Studio.

The admin first creates an instance group with the appropriate configuration settings, and then downloads the launch_ipykernel.py file that is bundled with Execution Engine for Apache Hadoop from a compute node in the Spectrum Conductor cluster. The file is located at /opt/ibm/dsxhi/jeg/kernelspecs/kernels/spark_python_conductor_cluster/scripts/launch_ipykernel.py.
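
For example, to copy the file off a compute node so it can be uploaded through the Spectrum Conductor UI (a sketch; user and compute-node.example.com are placeholders for your environment):

    scp user@compute-node.example.com:/opt/ibm/dsxhi/jeg/kernelspecs/kernels/spark_python_conductor_cluster/scripts/launch_ipykernel.py .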

From the Spectrum Conductor UI, select the instance group, and click Configure. On the Packages tab, click Create Single-File Packages. Use the launch_ipykernel.py to create the package. Leave the defaults except change the deploy location from $SPARK_HOME/jars to $SPARK_HOME/scripts.

Configuring the Spark options in Cloud Pak for Data

The Spectrum admin can provide configurations that allow users to pass additional Spark parameters, or adjust existing ones, when they create a Spark session through a Jupyter Notebook connecting to Spectrum Conductor. This gives the admin control over Spark sessions that are submitted through Execution Engine for Apache Hadoop.

Once the additional configurations are defined, a list of parameters and keys is available, and data scientists can update the settings from the Spectrum environment.

Setting up a list of Spark configuration parameters

To configure additional parameters or adjust default configuration values for Spark, the admin must modify the yarnJobParamsSPECTRUM.json file, which is located at /opt/ibm/dsxhi/conf/yarnJobParamsSPECTRUM.json.

To modify the file:

  1. Back up the file before you modify it.

  2. Modify the file. See the details and examples that follow for more information.

  3. Save the file.

  4. Verify that the content is still a valid JSON file by using the following command:

     cat yarnJobParamsSPECTRUM.json | python -m json.tool

     Confirm that the command returns a JSON-formatted object.

  5. IMPORTANT: If you have a Cloud Pak for Data cluster that has the Spectrum cluster registered, ask your Cloud Pak for Data admin to refresh the registration details page. Refreshing allows the new configurations to be retrieved and used.
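
From a terminal, steps 1 and 4 might look like the following (a sketch; the .bak file name is an arbitrary choice):

    cd /opt/ibm/dsxhi/conf
    cp yarnJobParamsSPECTRUM.json yarnJobParamsSPECTRUM.json.bak    # step 1: back up
    # ... edit yarnJobParamsSPECTRUM.json ...
    cat yarnJobParamsSPECTRUM.json | python -m json.tool            # step 4: validate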

Details on the content of the JSON file

The file contains three main sections:

  • scriptLanguages: This is not an option for Spectrum Conductor.

  • jobOptions: These are options that are tied to the UI and should not be removed. The values and bounds can be modified. The description of each entry explains what it is used for. The ones of interest are:

    • Available executor memory: Changing the bounds will be reflected in the UI
    • Available driver memory: Changing the bounds will be reflected in the UI

    Note: Some of these options are not used, because they are configured with the instance group instead.

  • extraOptions: These are extra Spark options that the Spectrum admin can set, either with a default value or without one. If value is specified, that value is always used when Cloud Pak for Data issues a call to create a JEG session. When these sessions are created, each option is translated to --conf option=value for JEG.

If a specific Spark option is not listed, it can be added as an entry. Review the Spark configuration options for the specific version of Spark 2.x that is running on the Spectrum Conductor cluster.

Example: Increase “Available driver memory” to allow higher memory allocation

    {
        "description": "Available driver memory",
        "displayName": "Driver memory (in MB)",
        "max": "10240",
        "min": "1024",
        "name": "spark.driver.memory",
        "type": "int",
        "labels": [ "spark" ]
    }

Example: If users need to tune spark.memory.fraction, add this entry in the extraOptions section:

    {
        "name": "spark.memory.fraction",
        "type": "float",
        "min": "0.5",
        "max": "0.9999",
        "value": ""
    }

If the admin determines that spark.memory.fraction should default to 0.9 when the user does not specify it, set the value field to 0.9:

    {
        "name": "spark.memory.fraction",
        "type": "float",
        "min": "0.5",
        "max": "0.9999",
        "value": "0.9"
    }

Parent topic: Administering Execution Engine for Apache Hadoop