Accessing Spark runtime driver and executor logs

You can access and view the Spark runtime driver and executor logs for watsonx.data Spark.

Applies to:

Spark engine

Apache Gluten accelerated Spark engine

Before you begin

Before you can monitor Spark applications, ensure that you meet the following prerequisite:
Required permissions
To debug the Spark runtime, you must be a User of the engine or a user of the storage volume that is associated with the engine.

Procedure

The method you use depends on the watsonx.data Spark configuration:

Downloading the driver logs persisted in storage

If the Spark advanced features are not enabled for your service instance, you can only view the Spark runtime driver logs by downloading them from storage.

To download the Spark runtime driver logs for debugging purposes if you do not have the advanced features enabled:

  1. Get the engine ID from the engine details page. See Getting connection information.
  2. Get or generate the access token. See Generating an API authorization token. Export the token in a variable:
    export TOKEN=<token_generated>
    
  3. Set the platform route, volume name, engine ID, application ID, and token:

    export cluster_url=<platform_instance_route>
    
    export VOLUME_NAME=<replace with name of the engine home volume>
    
    export ENGINE=<replace with Engine ID>
    
    export APPLICATION_ID=<replace with application_id>
    
    export TOKEN=<replace with bearer token>
  4. Start the file server:

    curl -k -X POST "https://${cluster_url}/zen-data/v1/volumes/volume_services/${VOLUME_NAME}" -H "Authorization: ZenApiKey ${TOKEN}" -H 'Content-Type: application/json' -d '{}'

    If you receive a 409 error, the file server is already running. You can ignore the error and proceed to the next step.

  5. Download the log file. The log file is stored in the volume at the path spark/<Engine_id>/<Application_id>/logs/spark-driver-<Application_id>-stdout. In the request URL, each slash in this path is encoded as %2F.

    curl -k -X GET "https://${cluster_url}/zen-volumes/${VOLUME_NAME}/v1/volumes/files/spark%2F${ENGINE}%2F${APPLICATION_ID}%2Flogs%2Fspark-driver-${APPLICATION_ID}-stdout" -H "Authorization: Bearer ${TOKEN}"
  6. Stop the file server:

    curl -k -X DELETE "https://${cluster_url}/zen-data/v1/volumes/volume_services/${VOLUME_NAME}" -H "Authorization: ZenApiKey ${TOKEN}" -H 'Content-Type: application/json' -d '{}'

Refer to Managing persistent volume instances with the Volumes API for more information on working with stored files.
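The URL construction and the 409 handling in the steps above can be sketched as small shell helpers. This is a minimal sketch, not part of any IBM tooling: the function names encode_volume_path, driver_log_url, and file_server_start_ok are hypothetical, and the host, volume, engine, and application values in the last line are made-up examples.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any IBM CLI.

# Percent-encode the slashes in a volume-relative path. The whole path is
# passed as one URL segment, which is why the documented URL contains %2F.
encode_volume_path() {
  printf '%s' "$1" | sed 's|/|%2F|g'
}

# Build the download URL for a driver log.
# $1 = cluster_url, $2 = VOLUME_NAME, $3 = ENGINE, $4 = APPLICATION_ID
driver_log_url() {
  path="spark/$3/$4/logs/spark-driver-$4-stdout"
  printf 'https://%s/zen-volumes/%s/v1/volumes/files/%s\n' \
    "$1" "$2" "$(encode_volume_path "$path")"
}

# Decide whether to continue after the "start the file server" request.
# $1 = HTTP status code. Any 2xx code counts as success, and 409 means the
# file server is already running, so it is also treated as success.
file_server_start_ok() {
  case "$1" in
    2??|409) return 0 ;;
    *)       return 1 ;;
  esac
}

# Example values only; replace with your own.
driver_log_url "cpd.example.com" "my-volume" "spark123" "app-42"
```

One way to obtain the status code for file_server_start_ok is to add -s -o /dev/null -w '%{http_code}' to the curl command in step 4 and capture its output in a variable.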

Downloading logs with Spark advanced features enabled

If the Spark advanced features are enabled for your instance, you can view or download the Spark runtime driver logs for debugging purposes. For details on enabling Spark advanced features, see Using advanced features.

You can download the logs in two ways:

  • Through the IBM Cloud Pak for Data web client

    1. From the navigation menu, click Services > Instances, find the instance and click it to view the instance details.
    2. Click on the right of the instance details page and select Deployment Space to open the deployment space on the Runtimes tab where you can view the Spark runtimes.
    3. Click the runtime to see the runtime runs. You can view the performance metrics, partitions, and execution plans of completed runtimes on the Spark history server. See Accessing the Spark history server.
    4. Click a runtime run to view the run details and log tail. You can download the complete log for the run by clicking Download log.

  • Through the REST API

    1. Get the name of the deployment space.

      1. From the navigation menu in Cloud Pak for Data, click Services > Instances, find the instance and click it to view the instance details.
      2. Make a note of the deployment space name.
    2. Export the following values. The runtime ID is included in your runtime POST response.

      export CLOUDPAKFORDATA_URL=<CloudPakforData_URL>
      export SPACE_NAME=<space_name>
      export APPLICATIONID=<application_id>
      
    3. Get the access token for the service instance. See Generating an API authorization token.

    4. Export the access token in a variable:

      export TOKEN=<ACCESS_TOKEN>
      
    5. Run the following commands to look up the space ID and download the driver log. To use the v2 API, replace <api_version> with v2; to use the v3 API, replace it with v3.

      SPACE_ID=$(curl -k -X GET "https://$CLOUDPAKFORDATA_URL/<api_version>/spaces?name=$SPACE_NAME" -H 'content-type:application/json' -H "Authorization: ZenApiKey ${TOKEN}" | python3 -c "import json, sys; print(json.load(sys.stdin)['resources'][0]['metadata']['id'])")
      
      curl -ivk "https://$CLOUDPAKFORDATA_URL/<api_version>/asset_files/runtimes%2Fspark%2F$APPLICATIONID%2Flogs%2Fspark-driver-$APPLICATIONID-stdout?space_id=$SPACE_ID" -H "accept: application/json" -H "Authorization: ZenApiKey ${TOKEN}"
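The asset_files URL in the last step can also be assembled by a small helper that rejects an unexpected API version before any request is sent. This is a sketch under stated assumptions: asset_log_url is a hypothetical function name, and the host, application ID, and space ID in the final line are example values only.

```shell
#!/bin/sh
# Sketch only: asset_log_url is a hypothetical helper name.

# Build the asset_files URL for a driver log.
# $1 = CLOUDPAKFORDATA_URL, $2 = API version (v2 or v3),
# $3 = APPLICATIONID, $4 = SPACE_ID
asset_log_url() {
  case "$2" in
    v2|v3) ;;
    *) echo "api version must be v2 or v3, got: $2" >&2; return 1 ;;
  esac
  # %% in the printf format produces a literal %, giving the %2F separators.
  printf 'https://%s/%s/asset_files/runtimes%%2Fspark%%2F%s%%2Flogs%%2Fspark-driver-%s-stdout?space_id=%s\n' \
    "$1" "$2" "$3" "$3" "$4"
}

# Example values only; replace with your own.
asset_log_url "cpd.example.com" "v2" "app-42" "space-1"
```

The resulting URL can be passed, in double quotes, to the curl -ivk command shown above so that the ? in the query string is not interpreted by the shell.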

View Spark application logs

export cluster_url=<platform_instance_route>
export VOLUME_NAME=<replace with name of the engine home volume>
export ENGINE=<replace with Engine ID>
export APPLICATION_ID=<replace with application_id>
export TOKEN=<replace with bearer token>

curl -k -X GET "https://${cluster_url}/zen-volumes/${VOLUME_NAME}/v1/volumes/files/spark%2F${ENGINE}%2F${APPLICATION_ID}%2Flogs%2Fspark-driver-${APPLICATION_ID}-stdout" -H "Authorization: Bearer ${TOKEN}"

View executor logs

By default, executor logs are not persisted. To persist them, set the following environment variable in the application payload. The executor logs are then written to the instance home volume, and you can download them from there.
"env": {
  "SPARK_WORKER_DIR": "/home/spark/shared/logs/executors"
}
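To show where the env block might sit, the following sketch prints a complete payload from a shell function. Only the env block comes from the text above. The surrounding application_details and application fields are assumptions about the payload shape, executor_log_payload is a hypothetical helper name, and /path/to/app.py is a placeholder; check the application submission reference for your release before relying on this layout.

```shell
#!/bin/sh
# Sketch only. The "application_details" and "application" fields are
# assumptions about the payload shape; only the "env" block is taken from
# the documentation text.

# Print a JSON payload that persists executor logs to the home volume.
# $1 = path to the application file (placeholder value used below)
executor_log_payload() {
  cat <<EOF
{
  "application_details": {
    "application": "$1",
    "env": {
      "SPARK_WORKER_DIR": "/home/spark/shared/logs/executors"
    }
  }
}
EOF
}

# Placeholder path only; replace with your own application.
executor_log_payload "/path/to/app.py"
```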