Jupyter notebooks
IBM® Spectrum Conductor bundles Jupyter notebooks to provide an interactive environment for data manipulation and visualization. To get the most out of Jupyter, apply the sc-2.5-build600258 interim fix, which contains several enhancements on top of the base integration.
Spark-ready kernels
IBM Spectrum Conductor supports the following Spark-ready kernels:
- Python
- Scala
- R
Note:
- To use the R kernel, you must set up the R environment and preinstall all required R packages on all driver resource groups. When you create the instance group, you must also set the spark.r.command parameter to the full path of your R script location.
- Spark 3.0.1 with Jupyter notebooks does not support the R and Scala kernels in Spark cluster mode. Support will become available when Jupyter Enterprise Gateway has a release that supports Spark 3.x with R and Scala.
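As a rough illustration of the R kernel requirement, the following Python sketch checks a candidate path before it is used as the spark.r.command value. It assumes that the value points at the Rscript executable, which is how Apache Spark describes this parameter. The helper name r_command_setting is hypothetical and is not part of IBM Spectrum Conductor.

```python
import os
import shutil


def r_command_setting(candidate=None):
    """Return a {"spark.r.command": path} mapping, or None if the path is unusable.

    Hypothetical helper: assumes the parameter should hold the absolute path
    of an executable such as Rscript.
    """
    # Fall back to searching PATH for Rscript when no candidate is given.
    path = candidate or shutil.which("Rscript")
    if path is None:
        return None
    # The parameter expects a full path, so reject relative paths.
    if not os.path.isabs(path):
        return None
    # The file must exist and be executable by the current user.
    if not (os.path.isfile(path) and os.access(path, os.X_OK)):
        return None
    return {"spark.r.command": path}
```

Running a check like this on a host in each driver resource group can reveal a missing or misplaced R installation before the instance group is created.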
Jupyter Enterprise Gateway
IBM Spectrum Conductor works with Jupyter Enterprise Gateway, a web server that launches kernels on remote servers throughout the enterprise. The Enterprise Gateway launcher includes the --RemoteProcessProxy.spark-context-initialization-mode option, which controls when the Spark context is created. IBM Spectrum Conductor supports the eager mode, which attempts to create the Spark context as soon as possible.
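The difference between creating a context eagerly and deferring it can be sketched in a few lines of Python. This is not the Enterprise Gateway launcher code. ContextHolder and parse_mode are hypothetical names, the factory stands in for whatever creates the Spark context, and the lazy mode appears only for contrast, because eager is the mode that IBM Spectrum Conductor supports.

```python
import argparse


class ContextHolder:
    """Holds a context object that is created either eagerly or lazily."""

    def __init__(self, factory, mode):
        if mode not in ("eager", "lazy"):
            raise ValueError(f"unsupported mode: {mode}")
        self._factory = factory
        self._context = None
        if mode == "eager":
            # Eager: build the context immediately, at launch time.
            self._context = factory()

    @property
    def created(self):
        return self._context is not None

    def get(self):
        # Lazy: build the context on first access if it does not exist yet.
        if self._context is None:
            self._context = self._factory()
        return self._context


def parse_mode(argv):
    """Read the initialization mode from a launcher-style argument list."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--RemoteProcessProxy.spark-context-initialization-mode",
        dest="mode",
        choices=["eager", "lazy"],
        default="eager",
    )
    # Ignore any other launcher arguments that this sketch does not model.
    args, _unknown = parser.parse_known_args(argv)
    return args.mode
```

With eager mode, the cost of creating the context is paid when the kernel starts, so the context already exists when the first notebook cell needs it.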
Home directory for notebook workload
Some functions of the Jupyter notebook require that the user who runs the notebook workload has an existing home directory. Ensure that this home directory exists before you run any Jupyter notebook workload.
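One way to verify this on a host is a short Python check against the local account database. This is a minimal sketch. The function name find_home_directory is hypothetical, and the check assumes a Unix-like host where the execution user can be resolved through the standard pwd module.

```python
import os
import pwd


def find_home_directory(username):
    """Return (home_path, exists) for a user, or (None, False) if the user is unknown."""
    try:
        entry = pwd.getpwnam(username)
    except KeyError:
        # The user cannot be resolved in the account database.
        return None, False
    home = entry.pw_dir
    # The account may list a home path that was never created on this host.
    return home, os.path.isdir(home)
```

For example, calling find_home_directory with the name of the workload execution user returns the configured home path and whether that directory is present on the host where the check runs.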