Submitting a Spark application with GPU RDD
You can submit an application that uses a GPU Resilient Distributed Dataset (RDD), which supports adaptive GPU scheduling, to the instance group.
Before you begin
- You must be a cluster or consumer administrator, consumer user, or have the Spark Applications Submit permission to submit Spark applications to an instance group.
- You must have started the instance group that allocates GPUs for its applications. See Starting Spark instance groups.
- You must have a Spark application to submit to the GPU instance group.
- Your Spark application can use the Python, Scala, or R RDD API to create a new GPU RDD; see the sketch after this list. For more information on the APIs, see GPU RDD sample and API examples. For a Python sample on IBM® Cloud, see conductor-gpu-sample.
- Alternatively, copy the wordcount_gpu.py sample, which uses the Python RDD API to create a new RDD whose tasks run on GPU slots. It is recommended that you use this sample when your cluster is installed to a shared file system, such as IBM Spectrum Scale. When you use the wordcount_gpu.py sample, complete these steps:
- Save the sample to the mounted file system, for example: /gpfs/conductorFS.
- Create a sample_input subdirectory to save input data in the file system, for example: /gpfs/conductorFS/sample_input.
Note: If you enabled adaptive scheduling, the SPARK_EGO_WORKLOAD_TYPE environment variable is set internally when the task runs to indicate the workload type (either GPU or CPU). You can define different logic for GPU and CPU processing in the application task logic. For example:

    def feature_extractor(path):
        # Branch on the workload type that the scheduler sets for this task.
        if os.environ.get("SPARK_EGO_WORKLOAD_TYPE") == "GPU":
            feature = runGPULogical()
        else:
            feature = runCPULogical()
        return feature

    sc.parallelize(...).gpu().map(lambda path: feature_extractor(path)).collect()
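For reference, the following is a minimal, self-contained sketch of the pattern that the preceding list describes; it is not the shipped wordcount_gpu.py sample. The input path matches the sample_input directory created in the steps above, and the .gpu() call is the RDD extension shown in the Note, so both are assumptions about your environment:

    # Minimal sketch, assuming a shared file system path such as
    # /gpfs/conductorFS/sample_input and the .gpu() RDD method shown
    # in the Note. Not the shipped wordcount_gpu.py sample.
    from pyspark import SparkContext

    sc = SparkContext(appName="wordcount-gpu-sketch")  # illustrative name

    counts = (sc.textFile("/gpfs/conductorFS/sample_input")
                .gpu()                                 # run this RDD's tasks on GPU slots
                .flatMap(lambda line: line.split())
                .map(lambda word: (word, 1))
                .reduceByKey(lambda a, b: a + b))
    print(counts.collect())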
About this task
You can write a Spark application that uses the Resilient Distributed Dataset (RDD) API in Python, Scala, or R to create a new GPU RDD whose tasks run on GPU resources in your cluster.
When you submit a Spark application as a batch application to the instance group, you can configure the following parameter and use the following sample:
- spark.ego.gpu.mode Spark parameter (or the SPARK_EGO_GPU_MODE environment variable): Specifies either exclusive or default GPU mode so that the Spark executor is started on a GPU with the mode that you request; see the sketch after this list.
- Submit the wordcount_gpu.py sample.
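As a sketch of the first item, the GPU mode can also be set programmatically when the application builds its SparkContext. The property name and the exclusive and default values come from this page; the application name is illustrative:

    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setAppName("wordcount-gpu-sketch")      # illustrative name
            .set("spark.ego.gpu.mode", "exclusive")) # or "default"
    sc = SparkContext(conf=conf)

Setting the SPARK_EGO_GPU_MODE environment variable in the submission environment is the documented alternative to setting the Spark parameter.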
Procedure
Results
What to do next
You can drill down from the Spark master web UI to monitor task details. If you enabled adaptive scheduling, you can also use the Workload Type column in the task list to check whether a task is running on a GPU or CPU host.