Customizing Spark applications and notebooks using the user-home volume
You can persist custom Python packages in the user-home volume and use them in Spark applications and Python notebooks across Spark instances, projects, and deployment spaces.
To use custom Python packages, copy them to the python-3 folder under /user-home/_global_/:
- Connect to the OpenShift cluster:
oc login OpenShift_URL:port
- Set the context to the project where Cloud Pak for Data is deployed:
oc project PROJECT-NAME
- Start the ibm-nginx deployment pod in debug mode. You must use the user ID shown in the following code sample:
oc debug deploy/ibm-nginx --as-user=1000330999
- Copy the Python packages to the /user-home/_global_/python-3 directory in the debug pod:
oc cp <python-package.tar.gz> ibm-nginx-debug:/user-home/_global_/python-3
- Untar the package in the /user-home/_global_/python-3 directory:
cd /user-home/_global_/python-3
tar xvzf <python-package.tar.gz>
The python-3 directory is already set in PYTHONPATH. Add the following line to the top of your PySpark application to import the package:
import <python-package>
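Because the python-3 directory is on PYTHONPATH, an unpacked package imports like any installed module. The following sketch simulates that layout locally so it can run anywhere; the package name `mypackage` is hypothetical, and on the cluster the `sys.path` change is unnecessary because /user-home/_global_/python-3 is already on the module search path:

```python
import importlib
import pathlib
import sys
import tempfile

# Simulate the user-home layout locally. On the cluster, the real
# directory is /user-home/_global_/python-3 and is already on PYTHONPATH.
pkg_root = pathlib.Path(tempfile.mkdtemp()) / "python-3"

# "mypackage" is a hypothetical package name standing in for the
# contents of <python-package.tar.gz> after it is untarred.
(pkg_root / "mypackage").mkdir(parents=True)
(pkg_root / "mypackage" / "__init__.py").write_text("VERSION = '1.0'\n")

# This line mimics what PYTHONPATH does for you on the cluster.
sys.path.insert(0, str(pkg_root))

mypackage = importlib.import_module("mypackage")
print(mypackage.VERSION)  # -> 1.0
```

In a real PySpark application, only the `import mypackage` line is needed at the top of the script, since the Spark runtime resolves the package from the user-home volume.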