Managing Analytics Engine powered by Apache Spark instances

On the details page of an instance, you can view information about the Spark instance, manage user access to the instance, or delete the instance. Users with the Administrator or Developer role can view instance details.

To manage a service instance for Analytics Engine powered by Apache Spark:

  1. From the Navigation menu on the IBM Cloud Pak for Data web user interface, click Services > Instances, find the instance and click it to view the instance details. These include:

    • The storage claim name
    • The endpoint to the Spark History server user interface
  2. If spec.serviceConfig.sparkAdvEnabled is set to true in the Analytics Engine custom resource (CR), you will also see:

    • The name of the deployment space
    • The URL to view the deployment space in which the Spark jobs run. Copy this URL to a new browser window to open the deployment space and see your Spark jobs.
  3. From the options menu on the right side of the window, you can:

    • Manage access: Only a user with the Administrator role can manage user access to an Analytics Engine powered by Apache Spark instance. From here, an administrator can grant users the Developer role on the instance so that they can submit Spark jobs. See Managing user access.
    • Delete: Only a user with the Administrator role can delete an Analytics Engine powered by Apache Spark instance.

      Important: If spec.serviceConfig.sparkAdvEnabled is set to true in the custom resource (CR), you must delete the deployment space that is associated with the instance if you want to create an instance again with the same name. Note that when you delete the deployment space, you will also delete all assets and jobs in that space.

      To delete a deployment space:

      1. From the Navigation menu on the Cloud Pak for Data web user interface, click Deployments.
      2. On the Spaces tab, search for the space named <InstanceName>_space. From the Actions menu on the right, select Delete.

      Note: The data files in the instance user’s home directory, which is created at the time the Analytics Engine powered by Apache Spark instance is provisioned, are not deleted when the instance is deleted. You must delete this data yourself.

Generating an access token

All users must generate their own access token to use the Spark jobs API. Users can either:

  • Get a bearer token with IAM integration disabled by typing this command:
    curl -k -X POST https://cpd_cluster_host/icp4d-api/v1/authorize -H 'cache-control: no-cache' -H 'content-type: application/json' -d '{"username":"admin","password":"password"}'
    

    Where you specify:

    • cpd_cluster_host as the URL for the Cloud Pak for Data cluster
    • Your user name and password for accessing the Cloud Pak for Data cluster

    The call returns a JSON response. The bearer token is the value of the access_token field:

      {
        "username": "admin",
        "role": "Admin",
        "permissions": [
          "administrator"
        ],
        "sub": "admin",
        "iss": "KNOXSSO",
        "aud": "DSX",
        "uid": "999",
        "authenticator": "default",
        "access_token": "eyJraWQiOiIyMDE3MDgwOS0wMDowMDowMCIsImFsZyI6...",
        "_messageCode_": "success"
        ....
      }
    
  • Get a bearer token with IAM integration enabled by using the IBM Cloud Pak foundational services URL. To get this URL, refer to Finding the IBM Cloud Pak foundational services URL. You need the foundational services URL in the following cURL command.

    1. Obtain the temporary IAM access token:
       curl -k -X POST -H "Content-Type: application/x-www-form-urlencoded;charset=UTF-8" \
       -d "grant_type=password&username=<username>&password=<password>&scope=openid" \
       <foundational-services-url>/idprovider/v1/auth/identitytoken
      
    2. Using the IAM access token, request the bearer token:
       curl -k -X GET 'https://cpd_cluster_host/v1/preauth/validateAuth' \
       -H 'username: admin' \
       -H 'iam-token: <iam-token>'
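
As a minimal sketch of how the token might be captured in a script, the helper below pulls a string field out of a JSON response with sed, so it does not depend on a JSON tool being installed. The function name json_string_field and the TOKEN variable are illustrative and not part of Cloud Pak for Data. The sketch assumes that the field value is a plain string without escaped double quotes. Only the access_token field of the authorize response is documented above; the field names returned by the IAM calls are not shown here, so check the actual response before you reuse the helper for them.

```shell
# Hypothetical helper: print the value of a string field from JSON read on stdin.
# $1 is the field name. Assumes the value contains no escaped double quotes.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Example use with IAM integration disabled (same placeholders as the command above):
# TOKEN=$(curl -k -s -X POST "https://cpd_cluster_host/icp4d-api/v1/authorize" \
#   -H 'content-type: application/json' \
#   -d '{"username":"admin","password":"password"}' | json_string_field access_token)
```

A bearer token is typically sent on later requests in an Authorization: Bearer header, for example -H "Authorization: Bearer $TOKEN".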
      

Finding the IBM Cloud Pak foundational services URL

The IBM Cloud Pak foundational services URL is the OpenShift route that is created by IBM Common Services. By default, the IBM Cloud Pak foundational services namespace is ibm-common-services, so you can find the URL by running this command:

  oc get routes -n ibm-common-services

The command returns output similar to the following:

  NAME          HOST/PORT                    PATH  SERVICES        PORT   TERMINATION           WILDCARD
  <cp-console>  <foundational services url>        <service name>  https  reencrypt/Redirect    None
  <cp-proxy>    <proxy URL>                        <service name>  https  passthrough/Redirect  None
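
To capture the URL in a script instead of reading it from the table, something like the following sketch may work. It assumes that the console route is named cp-console (the first row in the output above; the name on your cluster might differ) and that the route host is served over HTTPS. The function name foundational_services_url is hypothetical.

```shell
# Hypothetical helper: print the foundational services URL.
# $1 = namespace (default: ibm-common-services)
# $2 = route name (assumed to be cp-console; verify with "oc get routes")
foundational_services_url() {
  fs_namespace="${1:-ibm-common-services}"
  fs_route="${2:-cp-console}"
  # Ask OpenShift for the host name of the route only.
  fs_host=$(oc get route "$fs_route" -n "$fs_namespace" -o jsonpath='{.spec.host}') || return 1
  # Fail if the route exists but reports no host.
  [ -n "$fs_host" ] || return 1
  printf 'https://%s\n' "$fs_host"
}
```

The result can then be substituted for <foundational-services-url> in the identitytoken request.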

What to do next