Adding custom configurations

When you submit Spark jobs and kernels, you can specify the custom configurations to be applied to Spark runtimes.

Specifying custom configs at the instance level:

curl -k --request PATCH <INSTANCE_ENDPOINT>/default_configs -H "Authorization: Bearer <ACCESS_TOKEN>" --data-raw '{ "spark.ui.requestHeaderSize": "14k", "spark.eventLog.enabled": "true"}'
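
To confirm the applied values, you can read the instance defaults back. The following is a minimal sketch that assumes the same endpoint also supports GET; verify against your service's API reference:

curl -k --request GET <INSTANCE_ENDPOINT>/default_configs -H "Authorization: Bearer <ACCESS_TOKEN>"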

Specifying custom configs at the kernel level:

curl -k -X POST <KERNEL_ENDPOINT> -H "Authorization: Bearer <ACCESS_TOKEN>" -d '{"name":"python310", "engine": {"conf": {"spark.ui.requestHeaderSize":"16k", "spark.eventLog.enabled": "true"}}}'

Specifying custom configs at the job level:

curl -k -X POST <V4_JOBS_API_ENDPOINT> -H "Authorization: Bearer <ACCESS_TOKEN>" -d @input.json

An example input.json payload:
{
	"application_details": {
		"application": "/opt/ibm/spark/examples/src/main/python/wordcount.py",
		"arguments": ["/opt/ibm/spark/examples/src/main/resources/people.txt"],
		"conf": {
			"spark.app.name": "MyJob",
			"spark.eventLog.enabled": "true",
			"spark.ui.requestheadersize":"16k"
		}
	}
}

In addition to the instance-level and kernel-level configurations, you can also select the nodes in the Kubernetes cluster on which the Spark runtimes are deployed by using the tolerations, nodeSelector, and topologySpreadConstraints properties.

Specify these properties in Base64-encoded format.
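
For example, you can generate the Base64 value from a JSON file with a standard shell utility (the file name topology.json is illustrative):

# Print the file contents as a single-line Base64 string (GNU coreutils).
base64 -w 0 topology.json
# On macOS, use: base64 -i topology.json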

Example:

The following example shows a topologySpreadConstraints definition, first as plain JSON and then as the Base64-encoded value of the corresponding configuration property.

[
  {
    "maxSkew": 1,
    "topologyKey": "mynode",
    "whenUnsatisfiable": "DoNotSchedule"
  }
]
"ae.kubernetes.spec.topologySpreadConstraints": "WwogIHsKICAgICJtYXhTa2V3IjogMSwKICAgICJ0b3BvbG9neUtleSI6ICJteW5vZGUiLAogICAgIndoZW5VbnNhdGlzZmlhYmxlIjogIkRvTm90U2NoZWR1bGUiCiAgfQpdCg=="

The following example shows how to define all three configurations, namely tolerations, nodeSelector, and topologySpreadConstraints, together. All values are Base64 encoded.

"ae.kubernetes.spec.tolerations": "W3sKCiAgImtleSI6ICJrZXkxIiwKCiAgIm9wZXJhdG9yIjogIkVxdWFsIiwKCiAgInZhbHVlIjogInZhbHVlMSIsCgogICJlZmZlY3QiOiAiTm9FeGVjdXRlIiwKCiAgInRvbGVyYXRpb25TZWNvbmRzIjogOTAwCgp9LCB7CgogICJrZXkiOiAia2V5MTIiLAoKICAib3BlcmF0b3IiOiAiRXF1YWwiLAoKICAidmFsdWUiOiAidmFsdWUyIiwKCiAgImVmZmVjdCI6ICJOb0V4ZWN1dGUiLAoKICAidG9sZXJhdGlvblNlY29uZHMiOiA5MDAKCn1dCg==",
"ae.kubernetes.spec.nodeSelector": "bXlub2RlOiBzcGFyawo=",
"ae.kubernetes.spec.topologySpreadConstraints": "WwogIHsKICAgICJtYXhTa2V3IjogMSwKICAgICJ0b3BvbG9neUtleSI6ICJteW5vZGUiLAogICAgIndoZW5VbnNhdGlzZmlhYmxlIjogIkRvTm90U2NoZWR1bGUiCiAgfQpdCg=="

The configurations that the administrator sets at the service level are immutable. You cannot override them at the instance level or when you submit kernel or job requests. Even if you pass such configurations, they are ignored.

Parent topic: Running Spark applications interactively