Getting started with Spark applications

You can run Spark applications on an Analytics Engine powered by Apache Spark instance.

To get started with Spark applications:

  1. Provision an Analytics Engine powered by Apache Spark instance. You need the Administrator role in the project or deployment space to provision an instance. See Provisioning an instance.
  2. Manage the instance. You need the Administrator role in the project to manage the resource quota and user access.
    • Adjust the resource quota for the instance to fit the requirements of the Spark jobs that will run on it. See Changing instance resource quota.
    • Assign users the Developer role on the instance so that they can submit Spark jobs. See Managing user access.
  3. Generate an access token to use the Spark jobs API. See Generating an API authorization token; a sketch of the token request follows this list.
  4. Choose how to persist your Spark application job files. See Persisting Spark applications.
  5. Submit your Spark application as a job. See Submitting Spark jobs; a sketch that submits a job and polls its status follows this list.
  6. View the job status. See Viewing Spark job status.
  7. View job logs. See Accessing Spark job driver logs.
  8. Optionally, run Spark applications interactively. See Running Spark applications interactively.
  9. Debug your applications using the Spark history server. See Accessing the Spark history server.
  10. Access data from storage volumes in your Spark application by using the IBM Cloud Pak for Data volume API. See Accessing data from storage; a read example follows this list.
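
The following sketch shows one way to request an authorization token (step 3) with Python. The host name, the /icp4d-api/v1/authorize endpoint, and the credential fields are assumptions that depend on your Cloud Pak for Data version; use the exact request described in Generating an API authorization token.

    import requests

    CPD_HOST = "https://<cpd-host>"  # placeholder: base URL of your Cloud Pak for Data cluster

    # Request a bearer token. Endpoint and payload are assumptions; confirm them
    # in "Generating an API authorization token" for your version.
    response = requests.post(
        f"{CPD_HOST}/icp4d-api/v1/authorize",
        json={"username": "<username>", "api_key": "<api-key>"},
        verify=False,  # only if the cluster uses a self-signed certificate
    )
    response.raise_for_status()
    token = response.json()["token"]  # send as "Authorization: Bearer <token>"
    print("Token starts with:", token[:20])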
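
The next sketch submits an application through the Spark jobs API and then polls its state, covering steps 5 and 6. The spark_applications endpoint path, the payload fields, and the application path are illustrative assumptions only; the exact endpoint and payload for your instance are described in Submitting Spark jobs.

    import time

    import requests

    CPD_HOST = "https://<cpd-host>"      # placeholder base URL
    INSTANCE_ID = "<instance-id>"        # placeholder: ID of your instance
    HEADERS = {"Authorization": f"Bearer {token}"}  # token from the previous sketch

    # Hypothetical payload: the application path points at a job file that you
    # persisted in step 4, for example on a storage volume.
    payload = {
        "application_details": {
            "application": "/myapp/wordcount.py",
            "arguments": ["/myapp/data/input.txt"],
            "conf": {"spark.executor.instances": "2"},
        }
    }

    submit = requests.post(
        f"{CPD_HOST}/v4/analytics_engines/{INSTANCE_ID}/spark_applications",
        headers=HEADERS,
        json=payload,
        verify=False,
    )
    submit.raise_for_status()
    app_id = submit.json().get("id")

    # Poll the application until it reaches a terminal state (step 6).
    while True:
        details = requests.get(
            f"{CPD_HOST}/v4/analytics_engines/{INSTANCE_ID}/spark_applications/{app_id}",
            headers=HEADERS,
            verify=False,
        ).json()
        state = details.get("state")
        print("state:", state)
        if state in ("FINISHED", "FAILED", "STOPPED"):
            break
        time.sleep(10)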
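
Finally, a minimal PySpark sketch that reads a file from a storage volume inside the Spark application (step 10), assuming the volume is mounted into the application at a path such as /mounts/data as described in Accessing data from storage; the path and file name are placeholders.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("read-from-volume").getOrCreate()

    # Placeholder mount path: the volume must be attached to the application as
    # described in "Accessing data from storage".
    df = spark.read.option("header", "true").csv("/mounts/data/sales.csv")
    df.printSchema()
    print("rows:", df.count())

    spark.stop()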

Parent topic: Apache Spark