Why can Jupyter notebooks with Spark 2.1.0 or higher have only one active kernel?
When you use Jupyter 4.1.0 or 5.0.0 notebooks with Spark 2.1.0 or later, only one Jupyter notebook kernel can successfully start a SparkContext.
Subsequent kernels cannot start a SparkContext (sc). If you issue Spark commands in another kernel without stopping the running kernel, you encounter the following error: NameError: name 'sc' is not defined. The cause is the metastore_db that the embedded Derby metastore creates: it cannot be shared or duplicated within a single directory, so only the first kernel's SparkContext can use it.
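As a quick check from the notebook itself, you can verify whether the kernel actually received a SparkContext before issuing Spark commands. The following sketch is illustrative only and is not part of the product setup; it assumes a Python kernel in which sc is normally predefined.

   # Illustrative check only: confirm that this kernel has a SparkContext (sc)
   # before running Spark commands.
   try:
       sc
   except NameError:
       print("No SparkContext in this kernel; another kernel may already hold the "
             "embedded Derby metastore (metastore_db). Stop that kernel or enable "
             "concurrent access as described below.")
   else:
       print("SparkContext is available; Spark version:", sc.version)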
To run more than one Jupyter notebook kernel at the same time, enable concurrent user access to Spark SQL while using Derby as the embedded metastore. You can do this from the Data Connectors tab when you create an instance group. With concurrent access enabled, multiple Jupyter notebook kernels can start a SparkContext.