Setting custom configurations by using the API

When you submit a Spark runtime request, you can specify custom configurations to apply to the Spark runtime at the engine level and at the runtime level.

Applies to:

Spark engine

Apache Gluten accelerated Spark engine

Specifying custom configs at the engine level

Required permissions
You must have the Admin role.
curl -k --request PATCH <INSTANCE_ENDPOINT>/default_configs  -H "Authorization: Bearer <ACCESS_TOKEN>" --data-raw '{ "spark.ui.requestheadersize": "14k", "spark.eventLog.enabled": "true"}'
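For readers who script this call, the same engine-level request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper name build_default_configs_request and the example.invalid host are hypothetical placeholders, and the request is only built and printed, not sent.

```python
import json
import urllib.request


def build_default_configs_request(instance_endpoint, access_token, conf):
    """Build (but do not send) the PATCH request for engine-level defaults."""
    body = json.dumps(conf).encode("utf-8")
    # No Content-Type header is set because the curl command does not set one;
    # whether the service requires one is not stated in this document.
    return urllib.request.Request(
        url=f"{instance_endpoint}/default_configs",
        data=body,
        method="PATCH",
        headers={"Authorization": f"Bearer {access_token}"},
    )


request = build_default_configs_request(
    "https://example.invalid/engine",  # hypothetical stand-in for <INSTANCE_ENDPOINT>
    "ACCESS_TOKEN",                    # stand-in for <ACCESS_TOKEN>
    {"spark.ui.requestheadersize": "14k", "spark.eventLog.enabled": "true"},
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# To actually send the request: urllib.request.urlopen(request)
```

Unlike the curl command, which passes -k to skip certificate verification, urllib verifies TLS certificates by default.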


Specifying custom configs at the runtime level

Required permissions
You must have the User role.
curl -k -X POST <KERNEL_ENDPOINT> -H "Authorization: Bearer <ACCESS_TOKEN>" -d '{"name":"python310", "engine": {"conf": {"spark.ui.requestheadersize":"16k", "spark.eventLog.enabled": "true"}}}'
curl -k -X POST --url https://<cpd_host_name>/lakehouse/api/<api_version>/spark_engines/<spark_engine_id>/applications \
  -H "Authorization: Bearer <ACCESS_TOKEN>" -d @input.json
An example payload (input.json):
{
	"application_details": {
		"application": "/opt/ibm/spark/examples/src/main/python/wordcount.py",
		"arguments": ["/opt/ibm/spark/examples/src/main/resources/people.txt"],
		"conf": {
			"spark.app.name": "MyJob",
			"spark.eventLog.enabled": "true",
			"spark.ui.requestheadersize":"16k"
		}
	}
}
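The payload above can also be generated programmatically. The following is a minimal sketch, assuming that configuration values should be sent as strings, as they are in the example payload. The helper names to_conf_strings and build_application_payload are hypothetical and not part of any API.

```python
import json


def to_conf_strings(conf):
    """Convert values to strings; Python booleans become "true"/"false"."""
    return {
        key: (str(value).lower() if isinstance(value, bool) else str(value))
        for key, value in conf.items()
    }


def build_application_payload(application, arguments, conf):
    """Assemble a payload with the same shape as the example input.json."""
    return {
        "application_details": {
            "application": application,
            "arguments": list(arguments),
            "conf": to_conf_strings(conf),
        }
    }


payload = build_application_payload(
    "/opt/ibm/spark/examples/src/main/python/wordcount.py",
    ["/opt/ibm/spark/examples/src/main/resources/people.txt"],
    {
        "spark.app.name": "MyJob",
        "spark.eventLog.enabled": True,
        "spark.ui.requestheadersize": "16k",
    },
)

# Print the JSON text; redirect it to input.json to use it with curl -d @input.json.
print(json.dumps(payload, indent=2))
```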

In addition to the engine-level and runtime-level configurations, you can select the nodes in the Kubernetes cluster on which the Spark runtimes are deployed by using the tolerations, nodeSelector, and topologySpreadConstraints properties.

Specify the value of each property in Base64-encoded format.

Example:

The following example shows a topologySpreadConstraints definition, followed by the same definition as a Base64-encoded property value.

[
  {
    "maxSkew": 1,
    "topologyKey": "mynode",
    "whenUnsatisfiable": "DoNotSchedule"
  }
]
"ae.kubernetes.spec.topologySpreadConstraints": "WwogIHsKICAgICJtYXhTa2V3IjogMSwKICAgICJ0b3BvbG9neUtleSI6ICJteW5vZGUiLAogICAgIndoZW5VbnNhdGlzZmlhYmxlIjogIkRvTm90U2NoZWR1bGUiCiAgfQpdCg=="
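The encoded value above can be produced with a few lines of Python. This is a minimal sketch with a hypothetical helper name, encode_property. It serializes the definition with a two-space indent and a trailing newline because that is the text the encoded value above decodes to; whether the service is sensitive to whitespace is not stated here.

```python
import base64
import json


def encode_property(value):
    """Serialize a definition as indented JSON text and Base64-encode it."""
    text = json.dumps(value, indent=2) + "\n"
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


constraints = [
    {
        "maxSkew": 1,
        "topologyKey": "mynode",
        "whenUnsatisfiable": "DoNotSchedule",
    }
]

encoded = encode_property(constraints)
conf = {"ae.kubernetes.spec.topologySpreadConstraints": encoded}
print(json.dumps(conf, indent=2))
```

The resulting key and value can then be placed in the conf object of an engine-level or runtime-level request.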

The following example defines all three properties (tolerations, nodeSelector, and topologySpreadConstraints) together. All values are Base64 encoded.

"ae.kubernetes.spec.tolerations": "W3sKCiAgImtleSI6ICJrZXkxIiwKCiAgIm9wZXJhdG9yIjogIkVxdWFsIiwKCiAgInZhbHVlIjogInZhbHVlMSIsCgogICJlZmZlY3QiOiAiTm9FeGVjdXRlIiwKCiAgInRvbGVyYXRpb25TZWNvbmRzIjogOTAwCgp9LCB7CgogICJrZXkiOiAia2V5MTIiLAoKICAib3BlcmF0b3IiOiAiRXF1YWwiLAoKICAidmFsdWUiOiAidmFsdWUyIiwKCiAgImVmZmVjdCI6ICJOb0V4ZWN1dGUiLAoKICAidG9sZXJhdGlvblNlY29uZHMiOiA5MDAKCn1dCg==",
"ae.kubernetes.spec.nodeSelector": "bXlub2RlOiBzcGFyawo=",
"ae.kubernetes.spec.topologySpreadConstraints": "WwogIHsKICAgICJtYXhTa2V3IjogMSwKICAgICJ0b3BvbG9neUtleSI6ICJteW5vZGUiLAogICAgIndoZW5VbnNhdGlzZmlhYmxlIjogIkRvTm90U2NoZWR1bGUiCiAgfQpdCg=="
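Because encoded values are hard to review by eye, it can help to decode them before submitting a request. The sketch below decodes the three values from the example above; the helper name decode_property is hypothetical. Note that in this example the tolerations and topologySpreadConstraints values decode to JSON, while the nodeSelector value decodes to a plain key: value line.

```python
import base64
import json

encoded_conf = {
    "ae.kubernetes.spec.tolerations": "W3sKCiAgImtleSI6ICJrZXkxIiwKCiAgIm9wZXJhdG9yIjogIkVxdWFsIiwKCiAgInZhbHVlIjogInZhbHVlMSIsCgogICJlZmZlY3QiOiAiTm9FeGVjdXRlIiwKCiAgInRvbGVyYXRpb25TZWNvbmRzIjogOTAwCgp9LCB7CgogICJrZXkiOiAia2V5MTIiLAoKICAib3BlcmF0b3IiOiAiRXF1YWwiLAoKICAidmFsdWUiOiAidmFsdWUyIiwKCiAgImVmZmVjdCI6ICJOb0V4ZWN1dGUiLAoKICAidG9sZXJhdGlvblNlY29uZHMiOiA5MDAKCn1dCg==",
    "ae.kubernetes.spec.nodeSelector": "bXlub2RlOiBzcGFyawo=",
    "ae.kubernetes.spec.topologySpreadConstraints": "WwogIHsKICAgICJtYXhTa2V3IjogMSwKICAgICJ0b3BvbG9neUtleSI6ICJteW5vZGUiLAogICAgIndoZW5VbnNhdGlzZmlhYmxlIjogIkRvTm90U2NoZWR1bGUiCiAgfQpdCg==",
}


def decode_property(encoded):
    """Decode a Base64 property value back to text, rejecting invalid input."""
    return base64.b64decode(encoded, validate=True).decode("utf-8")


decoded = {key: decode_property(value) for key, value in encoded_conf.items()}
for key, text in decoded.items():
    print(key)
    print(text)

# The tolerations value is JSON, so it can be parsed for a structural check.
tolerations = json.loads(decoded["ae.kubernetes.spec.tolerations"])
print(len(tolerations), "tolerations defined")
```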

The configurations that the administrator sets at the service level are immutable. You cannot modify them at the engine level or when you submit runtime requests. If you pass such a configuration, it is ignored.
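The effect of immutable service-level settings can be pictured with a small model. This is an illustrative sketch only, not the service's implementation. It assumes that runtime-level values take precedence over engine-level defaults, which this document does not state, and it treats spark.eventLog.enabled as a hypothetical immutable setting purely for the sake of the example.

```python
def effective_conf(service_immutable, engine_defaults, runtime_conf):
    """Model which values apply and which user-supplied keys are ignored."""
    merged = dict(engine_defaults)
    merged.update(runtime_conf)  # assumption: runtime level overrides engine level
    # Keys supplied at either level that collide with an immutable setting.
    supplied = set(engine_defaults) | set(runtime_conf)
    ignored = sorted(supplied & set(service_immutable))
    merged.update(service_immutable)  # immutable service-level values always win
    return merged, ignored


# Hypothetical values for illustration.
service_immutable = {"spark.eventLog.enabled": "true"}
engine_defaults = {"spark.ui.requestheadersize": "14k", "spark.eventLog.enabled": "false"}
runtime_conf = {"spark.ui.requestheadersize": "16k"}

merged, ignored = effective_conf(service_immutable, engine_defaults, runtime_conf)
print(merged)
print("ignored:", ignored)
```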