Spark service CLI

The Analytics Engine powered by Apache Spark command-line interface (CLI) provides options for interacting with instances. You can use the CLI to manage instances and Spark applications.

Before you get started

Before you can get started using the Apache Spark CLI, you must define an environment variable because the analytics-engine namespace is hidden behind a feature flag.

Define the CPDCTL_ENABLE_ANALYTICS_ENGINE environment variable as follows:

CPDCTL_ENABLE_ANALYTICS_ENGINE=1
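
For example, in a Bash shell you can export the variable for the current session and, optionally, persist it in your profile. This is a minimal sketch; adapt it to your own shell and profile file:

# Enable the analytics-engine namespace in cpdctl for this shell session
export CPDCTL_ENABLE_ANALYTICS_ENGINE=1

# Optionally persist the setting for future sessions (assumes Bash and ~/.bashrc)
echo 'export CPDCTL_ENABLE_ANALYTICS_ENGINE=1' >> ~/.bashrc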

CLI help command

The Spark service CLI help command shows you the supported CLI commands:

cpdctl analytics-engine --help

For information about a particular command, use:

cpdctl analytics-engine [command] --help

Instance management commands

The instance management help command shows you the supported instance management CLI commands:

cpdctl analytics-engine instance --help

For information about a particular instance management command, use:

cpdctl analytics-engine instance [command] --help

instance get

Use this command to get the details of a provisioned instance, for example the instance home volume, the available resource quota, and other configurations. For help on the command syntax, enter:

cpdctl analytics-engine instance get --help

Example of using the instance get command

cpdctl analytics-engine instance get --instance-id 62f8f5de-6c56-499a-a01a-744c6e16caa1 --output json

{
  "configs": {},
  "context_id": "d57ea5e1-fbca-44ea-b72a-bb63ebecae9c",
  "context_type": "space",
  "home_volume": "volumes-silpi-test-vol-pvc",
  "instance_id": "62f8f5de-6c56-499a-a01a-744c6e16caa1",
  "resource_quota": {
    "avail_cpu_quota": 64,
    "avail_memory_quota_gibibytes": 200,
    "cpu_quota": 64,
    "memory_quota_gibibytes": 200
  }
}
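
Because the command returns JSON, you can script against its output. The following is a minimal sketch that extracts the available quota fields with jq; it assumes jq is installed and uses the instance ID from the example above as a placeholder:

# Replace the placeholder instance ID with your own
INSTANCE_ID=62f8f5de-6c56-499a-a01a-744c6e16caa1

# Query the instance and pull the available CPU and memory quota out of the JSON output
cpdctl analytics-engine instance get --instance-id "$INSTANCE_ID" --output json \
  | jq '{cpu: .resource_quota.avail_cpu_quota, memory_gib: .resource_quota.avail_memory_quota_gibibytes}'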

instance set-quota

Use this command to set the instance quota (the CPU and memory quota). For help on the command syntax, enter:

cpdctl analytics-engine instance set-quota --help

Example of using the instance set-quota command

cpdctl analytics-engine instance set-quota --instance-id 62f8f5de-6c56-499a-a01a-744c6e16caa1 --cpu-quota 64 --memory-quota 200
...
OK
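
After you change the quota, you can read the instance back to confirm the new values. A minimal sketch, assuming jq is available and reusing the instance ID from the example:

INSTANCE_ID=62f8f5de-6c56-499a-a01a-744c6e16caa1

# Raise the quota, then read it back to confirm the change
cpdctl analytics-engine instance set-quota --instance-id "$INSTANCE_ID" --cpu-quota 64 --memory-quota 200
cpdctl analytics-engine instance get --instance-id "$INSTANCE_ID" --output json | jq '.resource_quota'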

Spark application commands

The Spark application help command shows you the supported Spark application CLI commands for:

  • Submitting Spark applications
  • Stopping Spark applications
  • Getting the details of an application by application ID

cpdctl analytics-engine spark-app --help

For information about a particular spark-app command, use:

cpdctl analytics-engine spark-app [command] --help

With the Spark application CLI commands, you can perform the following operations:

spark-app submit

Use this command to submit a Spark application in an instance. For help on the command syntax, enter:

cpdctl analytics-engine spark-app submit --help

Example of using the spark-app submit command

cpdctl analytics-engine spark-app submit --instance-id 62f8f5de-6c56-499a-a01a-744c6e16caa1 --name "/opt/ibm/spark/examples/src/main/python/wordcount.py" --arguments "/opt/ibm/spark/examples/src/main/resources/people.txt" --output json
{
  "application_id": "03ae6297-d6a6-4032-adc6-861ead5f3ad2",
  "spark_application_id": "app-20220407090305-0000",
  "start_time": "Thursday 07 April 2022 09:03:05.686+0000",
  "state": "WAITING"
}
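
When you submit with --output json, you can capture the generated application ID for follow-up commands. A minimal sketch, assuming a Bash shell and jq, using the sample wordcount application shown above:

INSTANCE_ID=62f8f5de-6c56-499a-a01a-744c6e16caa1

# Submit the sample wordcount application and keep its application ID for later calls
APP_ID=$(cpdctl analytics-engine spark-app submit \
  --instance-id "$INSTANCE_ID" \
  --name "/opt/ibm/spark/examples/src/main/python/wordcount.py" \
  --arguments "/opt/ibm/spark/examples/src/main/resources/people.txt" \
  --output json | jq -r '.application_id')

echo "Submitted application: $APP_ID"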

spark-app get

Use this command to show the details of a submitted Spark application in an instance. For help on the command syntax, enter:

cpdctl analytics-engine spark-app get --help

Example of using the spark-app get command

The following example shows the details of a submitted Spark application:

cpdctl analytics-engine spark-app get --instance-id 62f8f5de-6c56-499a-a01a-744c6e16caa1 --application-id 4da21012-837b-4d61-b8c8-44140a4da956 --output json
{
  "application_details": {},
  "application_id": "4da21012-837b-4d61-b8c8-44140a4da956",
  "finish_time": "Tuesday 05 April 2022 16:35:42.355+0000",
  "mode": "stand-alone",
  "spark_application_id": "app-20220405163526-0000",
  "start_time": "Tuesday 05 April 2022 16:35:26.911+0000",
  "state": "FINISHED"
}
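
You can combine spark-app get with a small loop to wait for an application to finish. A minimal sketch, assuming Bash and jq; it polls until the state reaches FINISHED (other terminal states may exist, so extend the condition as needed):

INSTANCE_ID=62f8f5de-6c56-499a-a01a-744c6e16caa1
APP_ID=4da21012-837b-4d61-b8c8-44140a4da956   # replace with your application ID

# Poll the application state every 30 seconds until it reports FINISHED
while true; do
  STATE=$(cpdctl analytics-engine spark-app get \
    --instance-id "$INSTANCE_ID" --application-id "$APP_ID" \
    --output json | jq -r '.state')
  echo "Current state: $STATE"
  [ "$STATE" = "FINISHED" ] && break
  sleep 30
done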

spark-app stop

Use this command to stop a submitted Spark application in an instance. For help on the command syntax, enter:

cpdctl analytics-engine spark-app stop --help

Example of using the spark-app stop command

The following example shows how to stop a Spark application:

cpdctl analytics-engine spark-app stop --instance-id 62f8f5de-6c56-499a-a01a-744c6e16caa1 --application-id 63acf863-1aea-4aa8-a93d-08e30112fae9 --output json
""

Spark history server commands

The Spark history server help command shows you the supported history server CLI commands:

cpdctl analytics-engine history-server --help

For information about a particular history-server command, use:

cpdctl analytics-engine history-server [command] --help

history-server start

Use this command to start the Spark history server. For help on the command syntax, enter:

cpdctl analytics-engine history-server start --help

Example of using the history-server start command

The following example shows how to start the Spark history server:

cpdctl analytics-engine history-server start --instance-id 62f8f5de-6c56-499a-a01a-744c6e16caa1 --output json
{
  "message": "History server started successfully"
}

history-server stop

Use this command to stop the Spark history server. For help on the command syntax, enter:

cpdctl analytics-engine history-server stop --help

Example of using the history-server stop command

The following example shows how to stop the Spark history server:

cpdctl analytics-engine history-server stop --instance-id 62f8f5de-6c56-499a-a01a-744c6e16caa1 --output json
{
  "message": "History stopped successfully"
}
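
The history server commands fit naturally around a scripted run: start the server before you submit work and stop it when you no longer need it. A minimal end-to-end sketch, assuming Bash and the sample wordcount application used earlier:

INSTANCE_ID=62f8f5de-6c56-499a-a01a-744c6e16caa1

# Start the history server so the run can be inspected later
cpdctl analytics-engine history-server start --instance-id "$INSTANCE_ID" --output json

# Submit the sample application
cpdctl analytics-engine spark-app submit \
  --instance-id "$INSTANCE_ID" \
  --name "/opt/ibm/spark/examples/src/main/python/wordcount.py" \
  --arguments "/opt/ibm/spark/examples/src/main/resources/people.txt" \
  --output json

# ... inspect the application in the history server, then stop the server
cpdctl analytics-engine history-server stop --instance-id "$INSTANCE_ID" --output json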