Supported application languages and versions

Analytics Engine Powered by Apache Spark supports applications written in Python, R, and Scala. Two versions of Spark are currently supported: Spark 2.4 and Spark 3.0.

The following template IDs exist for the different languages and Spark versions:

Spark version / language               Template ID
Spark 2.4 / Python 3.7 or Python 3.8   spark-2.4.0-jaas-v2-cp4d-template
Spark 3.0 / Python 3.7 or Python 3.8   spark-3.0.0-jaas-v2-cp4d-template
Spark 2.4 / Scala 2.11                 spark-2.4.0-jaas-v2-cp4d-template
Spark 3.0 / Scala 2.12                 spark-3.0.0-jaas-v2-cp4d-template
Spark 2.4 / R 3.6                      spark-2.4.0-jaas-v2-cp4d-template
Spark 3.0 / R 3.6                      spark-3.0.0-jaas-v2-cp4d-template
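
The table above can also be written as a small lookup, which helps catch an unsupported pairing (for example Scala 2.12 on Spark 2.4) before a job is submitted. This is a minimal sketch in Python. The names `TEMPLATE_IDS`, `SUPPORTED_RUNTIMES`, and `template_id_for` are illustrative assumptions and are not part of the Analytics Engine API; only the template IDs and version pairings come from the table.

```python
# Template IDs copied from the table above. In that table the ID depends
# only on the Spark version, not on the language.
TEMPLATE_IDS = {
    "2.4": "spark-2.4.0-jaas-v2-cp4d-template",
    "3.0": "spark-3.0.0-jaas-v2-cp4d-template",
}

# Language runtimes listed for each Spark version in the table above.
SUPPORTED_RUNTIMES = {
    "2.4": {"python": {"3.7", "3.8"}, "scala": {"2.11"}, "r": {"3.6"}},
    "3.0": {"python": {"3.7", "3.8"}, "scala": {"2.12"}, "r": {"3.6"}},
}


def template_id_for(spark_version, language, language_version):
    """Return the template ID for a supported combination (hypothetical helper).

    Raises ValueError when the combination does not appear in the table.
    """
    runtimes = SUPPORTED_RUNTIMES.get(spark_version)
    if runtimes is None:
        raise ValueError(f"Unsupported Spark version: {spark_version}")
    versions = runtimes.get(language.lower())
    if versions is None or language_version not in versions:
        raise ValueError(
            f"{language} {language_version} is not listed for Spark {spark_version}"
        )
    return TEMPLATE_IDS[spark_version]


print(template_id_for("3.0", "scala", "2.12"))
```

The returned string is the value to place in the `template_id` field of a job payload.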

The following examples show sample payloads for submitting Spark jobs in different languages and for different Spark versions. Insert the template ID that matches the language and Spark version you need.

  • Payload for submitting a Spark job with Python 3.7:

      {
          "template_id": "<template_id>",
          "application_details": {
              "application": "/opt/ibm/spark/examples/src/main/python/wordcount.py",
              "application_arguments": ["/opt/ibm/spark/examples/src/main/resources/people.txt"],
              "conf": {
                  "spark.app.name": "MyJob",
                  "spark.eventLog.enabled": "true"
                  },
              "env": {
                  "SAMPLE_ENV_KEY": "SAMPLE_VALUE"
                  },
              "driver-memory": "4G",
              "driver-cores": 1,
              "executor-memory": "4G",
              "executor-cores": 1,
              "num-executors": 1
              }
      }
    
  • Payload for submitting a Spark job with Python 3.8:

      {
          "template_id": "<template_id>",
          "application_details": {
              "application": "/myapp/customApps/example.py",
              "application_arguments": ["<your_application_arguments>"],
              "conf": {
                  "spark.app.name": "MyJob",
                  "spark.eventLog.enabled": "true"
                  },
              "env": {
                  "RUNTIME_PYTHON_ENV": "python38",
                  "PYTHONPATH": "/myapp/pippackages:/home/spark/space/assets/data_asset:/home/spark/user_home/python-3:/cc-home/_global_/python-3:/home/spark/shared/user-libs/python:/home/spark/shared/conda/envs/python/lib/python/site-packages:/opt/ibm/conda/miniconda/lib/python/site-packages:/opt/ibm/third-party/libs/python3:/opt/ibm/image-libs/python3:/opt/ibm/image-libs/spark2/metaindexmanager.jar:/opt/ibm/image-libs/spark2/stmetaindexplugin.jar:/opt/ibm/spark/python:/opt/ibm/spark/python/lib/py4j-0.10.7-src.zip" 
                  }
          },
          "volumes": [{
              "name": "appvol",
              "mount_path": "/myapp",
              "source_sub_path": ""
          }]
      }
    
  • Payload for submitting a Spark Scala job:

      {
          "template_id": "<template_id>",
          "application_details": {
              "application": "/opt/ibm/spark/examples/jars/spark-examples*.jar",
              "application_arguments": ["1"],
              "class": "org.apache.spark.examples.SparkPi",
              "conf": {
                  "spark.app.name": "MyJob",
                  "spark.eventLog.enabled": "true"
                  },
              "env": {
                  "SAMPLE_ENV_KEY": "SAMPLE_VALUE"
                  },
              "driver-memory": "4G",
              "driver-cores": 1,
              "executor-memory": "4G",
              "executor-cores": 1,
              "num-executors": 1
              }
      }
    
  • Payload for submitting an R 3.6 Spark job:

      {
          "template_id": "<template_id>",
          "application_details": {
              "application": "/opt/ibm/spark/examples/src/main/r/dataframe.R",
              "conf": {
                  "spark.app.name": "MyJob",
                  "spark.eventLog.enabled": "true"
                  },
              "env": {
                  "SAMPLE_ENV_KEY": "SAMPLE_VALUE"
                  },
              "driver-memory": "4G",
              "driver-cores": 1,
              "executor-memory": "4G",
              "executor-cores": 1,
              "num-executors": 1
              }
      }
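
The payloads above share one shape: a `template_id`, an `application_details` object, and optionally a `volumes` list. The following Python sketch assembles such a payload programmatically. The field names are taken from the samples above; the helper `build_payload` and its parameter names are hypothetical, and the sketch only builds and prints the JSON. It does not submit the job, because the submission endpoint is not described in this section.

```python
import json


def build_payload(template_id, application, arguments=None, main_class=None,
                  conf=None, env=None, resources=None, volumes=None):
    """Assemble a job payload shaped like the samples above (hypothetical helper)."""
    details = {"application": application}
    if arguments:
        details["application_arguments"] = list(arguments)
    if main_class:
        # Only JVM applications (for example Scala) need a main class.
        details["class"] = main_class
    if conf:
        details["conf"] = dict(conf)
    if env:
        details["env"] = dict(env)
    # Resource settings such as "driver-memory" sit directly in application_details.
    details.update(resources or {})

    payload = {"template_id": template_id, "application_details": details}
    if volumes:
        payload["volumes"] = list(volumes)
    return payload


# Values below mirror the Scala sample payload, with the Spark 3.0 template ID.
scala_payload = build_payload(
    template_id="spark-3.0.0-jaas-v2-cp4d-template",
    application="/opt/ibm/spark/examples/jars/spark-examples*.jar",
    arguments=["1"],
    main_class="org.apache.spark.examples.SparkPi",
    conf={"spark.app.name": "MyJob", "spark.eventLog.enabled": "true"},
    env={"SAMPLE_ENV_KEY": "SAMPLE_VALUE"},
    resources={
        "driver-memory": "4G",
        "driver-cores": 1,
        "executor-memory": "4G",
        "executor-cores": 1,
        "num-executors": 1,
    },
)

print(json.dumps(scala_payload, indent=4))
```

For a Python 3.8 job, the same helper would be called without `main_class`, with `env` containing `RUNTIME_PYTHON_ENV` set to `python38`, and with a `volumes` entry such as the `appvol` mount shown in the Python 3.8 sample.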