watsonx.data Spark engine
watsonx.data Spark engine is one of the native Spark engines in IBM watsonx.data. It can process large-scale data, transform data, and run analytical workloads.
You can use the watsonx.data Spark engine to perform big data analytics. With the native Spark engine, you can manage the engine configuration, environment variables, and parameters, control access to Spark engines, and run applications that involve complex analytical operations by using the watsonx.data UI and REST API endpoints.
To provision a watsonx.data Spark engine, see Provisioning.
- Required permissions: To create a watsonx.data Spark engine, you must have the Admin role.
Supported Spark versions
| Name | Status | Release date | End-of-support date | Supported languages |
|---|---|---|---|---|
| Apache Spark 3.4.4 | Deprecated | June 2023 | June 2026 | Python 3.11, Scala 2.12 |
| Apache Spark 3.5.4 | Supported | February 2025 | February 2028 | Python 3.11, Scala 2.12 |
| Apache Spark 4.0.0 | Supported | August 2025 | August 2028 | Python 3.11, Scala 2.13 |
- Payload for submitting a Spark runtime with Python 3.11:
{"application_details":{"application":"<your_application_file_path>","arguments":["<your_application_arguments>"],"conf":{"spark.app.name":"MyRuntime","spark.eventLog.enabled":"true"},"env":{"RUNTIME_PYTHON_ENV":"python311"}}}
- Payload for submitting a Spark Scala runtime:
{"application_details":{"application":"/opt/ibm/spark/examples/jars/spark-examples*.jar","arguments":["1"],"class":"org.apache.spark.examples.SparkPi","conf":{"spark.app.name":"MyRuntime","spark.eventLog.enabled":"true","spark.driver.memory":"4G","spark.driver.cores":1,"spark.executor.memory":"4G","spark.executor.cores":1,"ae.spark.executor.count":1},"env":{"SAMPLE_ENV_KEY":"SAMPLE_VALUE"}}}
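The two payloads share one shape: an `application_details` object with the application path, its arguments, optional Spark `conf` settings, optional `env` variables, and, for Scala, a `class` entry. As a minimal sketch, the Python helper below assembles that structure and serializes it to JSON. The function name `build_spark_payload` and its parameters are hypothetical and are not part of any watsonx.data SDK. The sketch only builds the request body and makes no REST call, because this section does not give the endpoint URL or authentication details.

```python
import json


def build_spark_payload(application, arguments, conf=None, env=None, main_class=None):
    """Assemble a payload dict in the shape shown above (hypothetical helper)."""
    details = {"application": application, "arguments": list(arguments)}
    if main_class is not None:
        # Scala/Java applications name their entry point under the "class" key.
        details["class"] = main_class
    if conf:
        details["conf"] = dict(conf)
    if env:
        details["env"] = dict(env)
    return {"application_details": details}


# Python 3.11 runtime: the interpreter is selected through RUNTIME_PYTHON_ENV.
python_payload = build_spark_payload(
    application="<your_application_file_path>",
    arguments=["<your_application_arguments>"],
    conf={"spark.app.name": "MyRuntime", "spark.eventLog.enabled": "true"},
    env={"RUNTIME_PYTHON_ENV": "python311"},
)

# Scala runtime: same structure plus a main class and resource settings.
scala_payload = build_spark_payload(
    application="/opt/ibm/spark/examples/jars/spark-examples*.jar",
    arguments=["1"],
    main_class="org.apache.spark.examples.SparkPi",
    conf={
        "spark.app.name": "MyRuntime",
        "spark.eventLog.enabled": "true",
        "spark.driver.memory": "4G",
        "spark.driver.cores": 1,
        "spark.executor.memory": "4G",
        "spark.executor.cores": 1,
        "ae.spark.executor.count": 1,
    },
    env={"SAMPLE_ENV_KEY": "SAMPLE_VALUE"},
)

# Serialize to the JSON text that would be sent as the request body.
print(json.dumps(python_payload, indent=2))
print(json.dumps(scala_payload, indent=2))
```

Building the payload as a dictionary and serializing it with `json.dumps` avoids hand-editing a long single-line JSON string, where a missing quote or brace is easy to overlook.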