Kernel culling for Jupyter notebooks
When you assign a Jupyter notebook to a user and start it, the notebook stays running until it is stopped or until the cluster needs to rebalance. Within Jupyter, you can start multiple kernels, each of which can consume additional resources. The Python kernel takes no additional resources and runs on the same host as the Jupyter notebook. A Spark Python (Spark cluster mode) kernel does take additional resources: the kernel itself is launched as a Spark driver in the cluster and, as it runs, it spawns executors that take additional slots as required. These executors are managed in the standard Spark way. The kernel (the driver) is managed differently: you can have the kernels in your Jupyter notebook culled (killed) at certain intervals.
Kernel culling is configured by default for Jupyter 5.4.0 (or later) notebooks from the Notebook Management page of the cluster management console to reclaim kernel resources. Kernel resources are also reclaimed in the following cases:
- When the notebook service is stopped (all kernels shut down in the current notebook server).
- When the kernel is manually stopped and the application is killed from the cluster management console or Spark UI.
- When the kernel is reclaimed by the EGO resource manager (enforced reclaim by the resource orchestrator).
When you create notebook packages, you can specify timeout environment variables for the Jupyter kernel so that the kernel does not sit idle and its resources are reclaimed within a specific time frame. For details on creating packages, see Creating notebook packages. The following environment variables are available:
- JUPYTER_CULL_BUSY
- JUPYTER_CULL_CONNECTED
- JUPYTER_CULL_IDLE_TIMEOUT
- JUPYTER_CULL_INTERVAL
- JUPYTER_KERNEL_START_TIMEOUT
- JUPYTER_REQUEST_TIMEOUT
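As a minimal sketch, the variables above could be set in a notebook package's start script like this. The values shown are illustrative only, and the assumption that timeouts are expressed in seconds should be verified against your product documentation:

```shell
# Hypothetical excerpt from a notebook package start script.
# All values are examples; units are assumed to be seconds.

export JUPYTER_CULL_IDLE_TIMEOUT=3600    # cull kernels that have been idle this long
export JUPYTER_CULL_INTERVAL=300         # how often to check for idle kernels
export JUPYTER_CULL_BUSY=false           # whether to cull kernels that are busy
export JUPYTER_CULL_CONNECTED=false      # whether to cull kernels with connected clients
export JUPYTER_KERNEL_START_TIMEOUT=60   # how long to wait for a kernel to start
export JUPYTER_REQUEST_TIMEOUT=60        # how long to wait for kernel requests
```

With settings like these, a kernel that stays idle for an hour is culled, freeing its driver and executor slots for other workloads, while busy kernels are left running.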