Jupyter notebook environment variables
For built-in Jupyter notebooks, these environment variables are supported.
Precedence of environment variable values
- A notebook user can define environment variables for their own notebooks. An administrator can also do this by configuring the notebook for the user.
- An administrator can define environment variables when enabling notebooks for an instance group.
- A cluster administrator can define environment variables when defining a notebook package.
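The following minimal sketch illustrates how the three scopes above could combine into one effective environment. It assumes that the list is ordered from highest to lowest precedence (notebook user, then instance group, then notebook package); the variable values and the `resolve_env` helper are hypothetical and are not part of the product.

```python
from collections import ChainMap

# Hypothetical values for illustration only; each dict stands for one scope.
notebook_package_env = {"JEG_LOG_LEVEL": "INFO", "JUPYTER_CULL_INTERVAL": "600"}
instance_group_env = {"JEG_LOG_LEVEL": "WARN"}
notebook_user_env = {"JEG_LOG_LEVEL": "DEBUG", "CONDA_ENV_NAME": "my_env"}


def resolve_env(user, instance_group, package):
    """Merge the scopes; ChainMap returns the value from the first mapping
    that defines a key, so the notebook user's value wins (assumed order)."""
    return dict(ChainMap(user, instance_group, package))


effective = resolve_env(notebook_user_env, instance_group_env, notebook_package_env)
print(effective["JEG_LOG_LEVEL"])
```

Variables that are defined in only one scope, such as `JUPYTER_CULL_INTERVAL` in this sketch, pass through unchanged.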
Environment variables | Description |
---|---|
CONDA_ENV_NAME | Allows notebook users to specify a conda environment other than the default. Valid values: the name of your conda environment. |
JEG_LOG_LEVEL | Specifies the log level for the Jupyter Enterprise Gateway server (string). Valid values: 0, 10, 20, 30, 40, 50, DEBUG, INFO, WARN, ERROR, or CRITICAL. Default: INFO |
JEG_STARTUP_TIMEOUT | Specifies the amount of time, in seconds, that the Jupyter start script waits after launching the Jupyter Enterprise Gateway process before it determines whether startup succeeded or failed. This setting is useful because startup might be slower on some hosts. Valid values: any positive integer. Default: 5 |
JUPYTER_IP_BLOCKLIST | A comma-separated list of local IPv4 addresses (or regular expressions) that must not be used when determining the response address that conveys connection information back to the Jupyter Enterprise Gateway from a remote kernel. In some cases, other network interfaces (for example, Docker with 172.17.0.*) can interfere, which leads to connection failures during kernel startup. Valid values: a single IPv4 address or a comma-separated list of IPv4 addresses. Either form can contain a wildcard character. Example: 172.17.0.*,192.168.0.27, which excludes all addresses that start with 172.17.0, as well as the single IPv4 address 192.168.0.27. |
JUPYTER_CULL_BUSY | Specifies whether to cull a kernel that is busy, regardless of whether it is currently running cells. Specify True to cull busy kernels. If you enable this setting, you will generally want a long wait before culling, so that only kernels that have been running too long are killed. Valid values: True or False. Default: False |
JUPYTER_CULL_CONNECTED | Specifies whether to cull a kernel while a connection to it exists, that is, a kernel that has an active browser connected to it. By default, this is set to False, so notebooks are not culled while you leave your Jupyter browser window open (assuming that the machine with the open browser maintains an active internet connection). Specify True to cull kernels even when a browser window is open. Valid values: True or False. Default: False |
JUPYTER_CULL_IDLE_TIMEOUT | Specifies the amount of time, in seconds, to wait before culling a kernel. The default value is 3600 seconds (one hour). To disable culling, set this environment variable to 0. Valid values: 0 (disables culling) or any positive integer; the minimum allowed value when culling is enabled is 1. Default: 3600 |
JUPYTER_CULL_INTERVAL | Specifies the interval, in seconds, at which kernels are checked against the JUPYTER_CULL_IDLE_TIMEOUT value; that is, how often to check whether a kernel should be culled. By default, this interval is 600 seconds (10 minutes). Note that when a kernel is culled, the entire application is stopped. When you return to your notebook and relaunch the file and kernel, the kernel is new, so anything that was saved must be rerun. Valid values: any positive integer. Default: 600 |
JUPYTER_ENV_ALLOWLIST | Specifies a list of environment variables to pass from the kernel launching process into the kernel itself. Specify the value as a comma-delimited list of environment variable names, each wrapped in single quotation marks. The variables can be either specified by the notebook user or set automatically when the notebook user logs in to the operating system. When specified, the notebook user has access to these environment variables in Jupyter GUI cells; otherwise, they are available only in Jupyter terminals. Valid values: a comma-separated list of environment variable names. Example: 'VAR1','VAR2' |
JUPYTER_KERNEL_START_TIMEOUT | Specifies the amount of time, in seconds, to wait for the kernel to start before timing out. Valid values: any positive integer. Default: 300 |
JUPYTER_REQUEST_TIMEOUT | Specifies the amount of time, in seconds, to wait before the kernel returns an error. Valid values: any positive integer. Default: 400 |
JUPYTER_SPARK_OPTS | Specifies additional Spark parameters (such as priority) to use either in the Spark submit command, if you enable a notebook for an instance group, or in the kernel startup, if you configure a notebook package. Example: JUPYTER_SPARK_OPTS = "--conf spark.ego.priority=3000", which specifies that after the kernel starts, the notebook application has a priority of 3000 instead of the default 5000. |
JUPYTER_USER_SPARK_OPTS | Specifies additional notebook user parameters (such as your principal and the location of your keytab file for your notebook). The system can then use this information when starting the Kerberos authenticated notebook using a service-level impersonation user. Notebook users can add this environment variable to the notebooks that they own, and the notebook user's value takes precedence if the same environment variable name is defined elsewhere. This environment variable uses the same format as the JUPYTER_SPARK_OPTS parameter. |
JUPYTERLAB_ENABLED | Specifies whether to use the JupyterLab web-based interface instead of the default Jupyter notebook interface. Valid values: true or false. |
NOTEBOOK_EXTRA_CONF_FILE | The path to an extra configuration file that runs at notebook start time. The file can define extra customization to the environment in which the notebook starts. Tip: If you want to customize environment variables with the script defined in the path of NOTEBOOK_EXTRA_CONF_FILE, consider the environment variables that might already have values as part of the process environment in which the service runs. You might want to append to the existing environment variables where applicable, rather than completely overwrite them. For example, if your instance group contains data connectors, the notebook service automatically has a value for the JUPYTER_SPARK_OPTS environment variable that contains configuration for the data connectors. To see the current environment variable list for your notebook services: |
NOTEBOOK_SPARK_PYSPARK_PYTHON | Specifies the path to the Python executable for the notebook. The built-in Jupyter start script consumes this NOTEBOOK_SPARK_PYSPARK_PYTHON environment variable. For custom notebooks, use this environment variable name in your start scripts. For Dockerized notebooks (see Adding Dockerized notebooks), if the default value is not used and you want to run notebook applications in client mode, manually add a data volume to your notebook configuration to mount the non-default path. For notebooks using Anaconda or Miniconda, this environment variable is automatically set to the path to your Python location in the conda environment bin directory for the notebook. If you are using a Python executable that is not located inside the conda environment bin directory, set the NOTEBOOK_SPARK_PYSPARK_PYTHON environment variable in the Conda environment field when configuring the notebook for your instance group (see Enabling notebooks for an instance group). |
NOTEBOOK_SPARK_R_COMMAND | Specifies the path to the executable for running R scripts in the notebook. The built-in Jupyter start script consumes this NOTEBOOK_SPARK_R_COMMAND environment variable. For custom notebooks, use this environment variable name in your start scripts. For Dockerized notebooks (see Adding Dockerized notebooks), if the default value is not used and you want to run notebook applications in client mode, manually add a data volume to your notebook configuration to mount the non-default path. For notebooks using Anaconda or Miniconda, this environment variable is automatically set to the path to your R script location in the conda environment bin directory for the notebook. If you are using an R script that is not located inside the conda environment bin directory, set this NOTEBOOK_SPARK_R_COMMAND environment variable in the Conda environment field when configuring the notebook for your instance group (see Enabling notebooks for an instance group). |
SITE_PACKAGE_PATH | Speeds up the Jupyter start script. To use this setting, specify the path to the site-packages directory of the notebook user's conda environment directly, rather than have the start script run a find command to locate it. This environment variable is useful where the find command can take a very long time (such as in very large conda environments or on slower file systems). Use this setting when the path to the site-packages directory is known; otherwise, use the SITE_PACKAGE_FIND_DEPTH environment variable. |
SITE_PACKAGE_FIND_DEPTH | Speeds up the Jupyter start script. To use this setting, specify the maximum depth for the find command when it searches for the conda environment's site-packages directory. This environment variable is useful where the find command can take a very long time (such as in very large conda environments or on slower file systems). Use this setting when the path to the site-packages directory is unknown; otherwise, use the SITE_PACKAGE_PATH environment variable. |
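To illustrate how a JUPYTER_IP_BLOCKLIST value such as the example in the table above could filter candidate addresses, here is a minimal sketch. It is not the Jupyter Enterprise Gateway implementation: it assumes that `*` behaves as a simple glob wildcard (handled here with Python's `fnmatch` module), it does not cover regular-expression entries, and the `parse_blocklist` and `is_blocked` helpers and the candidate addresses are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_blocklist(value):
    """Split a comma-separated blocklist string into individual patterns."""
    return [entry.strip() for entry in value.split(",") if entry.strip()]


def is_blocked(ip, patterns):
    """Return True if the address matches any pattern.
    Assumption: '*' is treated as a glob wildcard, not as a regular expression."""
    return any(fnmatchcase(ip, pattern) for pattern in patterns)


# Example value taken from the JUPYTER_IP_BLOCKLIST row.
patterns = parse_blocklist("172.17.0.*,192.168.0.27")

# Hypothetical local addresses to filter.
candidates = ["172.17.0.1", "192.168.0.27", "192.168.0.28", "10.0.0.5"]
usable = [ip for ip in candidates if not is_blocked(ip, patterns)]
print(usable)
```

In this sketch, the Docker-style address and the single listed address are excluded, while the remaining addresses stay available as response addresses.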