Customizing the Apache Spark directory structure

IBM® Open Data Analytics for z/OS® installs Apache Spark into a z/OS file system (zFS) or hierarchical file system (HFS) directory. This documentation refers to the installation directory as SPARK_HOME. The default installation directory is /usr/lpp/IBM/izoda/spark/sparknnn, where nnn is the current Apache Spark version (for instance, /usr/lpp/IBM/izoda/spark/spark24x for Spark 2.4.8).

By default, Apache Spark runs from the installation directory, and most of its configuration files, log files, and working data are stored in the installation directory structure. On z/OS systems, however, using the installation directory for all of these purposes is not ideal. Therefore, by default, Open Data Analytics for z/OS installs Apache Spark in a read-only file system. The following tasks describe how to set up customized directories for the Apache Spark configuration files, log files, and temporary work files. While you can customize the directory structure that Apache Spark uses, the examples here follow the Filesystem Hierarchy Standard.
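As a sketch of what such a customized layout might look like, the following shell snippet sets the standard Apache Spark environment variables that control these locations. The variable names (SPARK_CONF_DIR, SPARK_LOG_DIR, SPARK_LOCAL_DIRS, SPARK_WORKER_DIR) are standard Spark environment variables; the directory paths shown are illustrative choices that follow the Filesystem Hierarchy Standard, not product defaults, so substitute the paths chosen for your installation:

```shell
# Illustrative FHS-style directories; the paths are example choices,
# not IBM defaults. Work with your system programmer to create them.
export SPARK_CONF_DIR=/etc/spark/conf      # configuration files
export SPARK_LOG_DIR=/var/spark/logs       # log files
export SPARK_LOCAL_DIRS=/tmp/spark/work    # temporary/scratch work files
export SPARK_WORKER_DIR=/var/spark/worker  # per-application worker directories
```

These settings are typically placed in the same shell environment files that set SPARK_HOME so that they take effect for every Spark invocation.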

Plan to work with your system programmer who has authority to update system directories.

Note: SPARK_HOME is an environment variable that many Apache Spark scripts use. This variable must contain the path to the z/OS Spark installation directory. In step 2 of Setting up a user ID for use with z/OS Spark, you determined which files to update to set the z/OS UNIX shell environment. Update the files that you modified in that step to set and export the SPARK_HOME environment variable. For example:
export SPARK_HOME=/usr/lpp/IBM/izoda/spark/spark24x
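After updating your shell environment files, you can confirm that the variable is set in a new shell session with a quick sanity check such as the following (the path shown is the Spark 2.4.8 default from above; adjust it for your installation):

```shell
# Quick check that SPARK_HOME is exported and points to the install directory.
# The path below is the default for Spark 2.4.8; yours may differ.
export SPARK_HOME=/usr/lpp/IBM/izoda/spark/spark24x
echo "$SPARK_HOME"
```

The Spark scripts, such as spark-submit, are then found under $SPARK_HOME/bin.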