VS Code development environment - Spark labs
The VS Code development environment enables you to interactively program, debug, submit, and test Spark applications on a Spark cluster that runs on the Spark engine.
It is available as a Visual Studio Code extension that you can install on your local system to access the Spark IDE from Visual Studio Code. It reduces development time and improves usability.
Before you begin
- Install a desktop version of Visual Studio Code.
- Install the watsonx.data extension from the VS Code Marketplace.
- Install the Remote - SSH extension from the VS Code Marketplace.
About this task
- Setting up the Spark labs
- Open Visual Studio Code. The watsonx.data icon appears in the left navigation window. Click the icon. The Welcome to IBM watsonx.data extension window opens.
- Click Manage Connection. The Manage Connection watsonx.data window opens.
- Configure one of the following details:
- JSON Inputs
- Form Inputs
- To configure JSON Inputs, click JSON Inputs and specify the following details:
- API Key: Provide the platform API key. To generate the API key, see Platform API key.
- Connection JSON: Provide the connection details from the watsonx.data user interface. To do that:
- Log in to your watsonx.data page.
- From the navigation menu, click Connection Information.
- Click VS Code. Copy the configuration from the VS Code connection configuration field and use it as the Connection JSON field value. For more information, see Getting connection information.
- To configure Form Inputs, click Form Inputs and specify the following details:
- Host address of watsonx.data console: Provide the host IP address of watsonx.data. To retrieve the host IP address, see Getting connection information.
- Environment Type: Select Software.
- Username: The watsonx.data login username.
- API Key: Provide the platform API key. To generate the API key, see Platform API key.
- Click Test & Save. The Connected to watsonx.data message is displayed.
- Create a Spark lab.
- To create a new Spark lab, click the + icon. The Create Spark Lab window opens. Specify a unique name for the Spark lab and select the Spark Version. The default Spark version is 3.4. You can modify the other optional fields if required. Note: The spark.hadoop.wxd.apikey parameter is configured in the Spark configurations field by default when you create a Spark lab.
- Click Create. Click Refresh to see the Spark lab in the left window. This is the dedicated Spark cluster for application development.
- Open the Spark lab to access the file system and terminal, and work with it.
- In the Explorer window, you can view the file system, upload files, and view logs.
Note: To delete a running Spark lab, hover over the name of the Spark lab in the watsonx.data left navigation pane and click the Delete icon.
- Developing a Spark application
- Develop a Spark application in the Spark lab. You can work with a Spark application in one of the following ways:
- Create your own Python file
- Create, upload, or drag the Python application file to the Explorer window. The file opens in the right pane of the Visual Studio Code application.
- Run the following command in the terminal. This initiates a Python session and you can see the acknowledgment message in the terminal.
python <filename>
- Create Jupyter Notebooks
- Browse for the Jupyter extension in the VS Code Marketplace and install it.
- Create a new Jupyter Notebook file with the .ipynb extension, or drag and drop an existing notebook to the Explorer window.
- From the Explorer window, double-click to open the Jupyter Notebook.
- From the Jupyter Notebook, click the Change Kernel link to select the Python Environment.
- The Jupyter Notebook is now ready to use. You can write your code and execute it cell by cell.