Jobs in an analytics project (Watson Studio)

A job is a way of running operational assets, such as Data Refinery flows, SPSS Modeler flows, or Notebooks in a project in Watson Studio. You can also create jobs for a promoted asset in deployment spaces.

You can create jobs for the following assets:

Data Refinery flows in projects and spaces
DataStage flows in projects
SPSS Modeler flows in projects
Jupyter Notebooks in projects
Python and R scripts in projects
Metadata import assets in projects

You can create a job in one of several ways:

When you work directly on the asset in a tool in a project:
- A Data Refinery flow in Data Refinery. See Creating jobs in Data Refinery.
- A DataStage flow in DataStage. See Creating jobs in DataStage.
- An SPSS Modeler flow in SPSS Modeler. See Creating jobs in SPSS Modeler.
- A Notebook in the Notebook editor. See Creating jobs in the Notebook editor or viewer.
At the time the asset is created in a project. See Creating a metadata import job.
From a project’s Assets page. Select the asset from the section for your asset type and choose Create job from the asset’s Actions menu.
From the Assets page in a deployment space. Select the asset and choose Create job from the asset’s Actions menu. Currently, this only works for promoted Data Refinery flows.

You can’t create jobs directly in JupyterLab or RStudio. To create jobs for Notebooks or Python scripts that are created in JupyterLab, or for R scripts that are created in RStudio, you must push the assets from the IDE to the Git repository associated with your project and then sync the repository with the project. Any Notebooks, scripts, or RShiny apps with a size of zero bytes that are pushed to the Git repository are considered invalid and are not synced with the project. You can create jobs for synced Git assets:

From the Notebook viewer for Notebooks. See Creating jobs in the Notebook editor or viewer.
From the project Assets page for Notebooks and scripts. See Creating jobs from the Assets page.
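
Because zero-byte files are not synced with the project, it can help to check a local clone before you push. The following is a minimal sketch, not part of the product: the function name `find_empty_assets` and the list of file extensions are assumptions made for illustration.

```python
from pathlib import Path

# Assumed extensions for Notebooks, Python scripts, and R scripts.
CHECKED_SUFFIXES = {".ipynb", ".py", ".r"}


def find_empty_assets(repo_root):
    """Return files under repo_root that have a size of zero bytes."""
    root = Path(repo_root)
    empty = []
    for path in sorted(root.rglob("*")):
        if ".git" in path.relative_to(root).parts:
            continue  # skip Git's internal files
        if path.is_file() and path.suffix.lower() in CHECKED_SUFFIXES:
            if path.stat().st_size == 0:
                empty.append(path)
    return empty
```

Calling `find_empty_assets(".")` from the root of the clone returns the files that would be treated as invalid, so you can fix or remove them before pushing.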

From the Jobs tab of your project, you can:

See the list of the jobs in your project.
View the details of each job and edit its settings. You can also start a job manually from here. See Viewing job details.
Monitor job runs.
Delete jobs.

Creating jobs in Data Refinery

You can create a job to run a Data Refinery flow directly in Data Refinery.

In Data Refinery, click the Jobs icon from the Data Refinery toolbar and select Save and create a job.
Define the job details by entering a name and a description (optional).
On the Configure page, you can:
- View which input data file is used and what the output file will be called.
- Select an environment runtime for the job.
On the Schedule page, you can optionally add a one-time or repeating schedule.
Review the job settings. Then, create the job and run it immediately, or create the job and run it later.

You can track the status of a job’s run and view the logs on the job’s run description page. See Viewing job details.

Creating jobs in DataStage

In DataStage, a job is a platform runtime asset that is associated with a DataStage flow. Multiple jobs can be associated with the same DataStage flow. Jobs can be scheduled or run as needed.

Jobs are automatically created for you when you edit or work with a DataStage flow on the canvas. When you click Run on the canvas, a job is created and started. Jobs maintain their past runs and logs, which you can view on the jobs dashboard.

Because any number of jobs can be associated with a single DataStage flow, a DataStage flow has a one-to-many relationship with the job asset type.

To manually create a job, complete the following steps:

Open the project that contains the DataStage flow that you want to work with, then click the Assets tab.
Go to the DataStage flows section and click the vertical ellipsis icon by the DataStage flow that you want to work with. Then, click Create job.
Continue through the next configuration steps by following the job creation wizard. Then, click Create or Create and run.

Creating jobs in SPSS Modeler

You can create a job to run an SPSS Modeler flow directly in SPSS Modeler.

In SPSS Modeler, click the Jobs icon from the SPSS Modeler toolbar and select Create a job.
Define the job details by entering a name and a description (optional), and by specifying any job parameters that are set up in the flow’s properties.
On the Schedule page, you can optionally add a one-time or repeating schedule.
Review the job settings. Then, create the job and run it immediately, or create the job and run it later.

You can track the status of a job’s run and view the logs on the job’s run description page. See Viewing job details.

Creating jobs in the Notebook editor or viewer

You can create a job to run a Notebook directly from the Notebook editor or the Notebook viewer by clicking the Jobs icon from the Notebook’s menu bar. See Create a Notebook job.

Creating a metadata import job

A metadata import job is created at the time you create a metadata import asset. See Creating a metadata import asset and importing metadata.

Creating jobs from the Assets page

You can create jobs for Data Refinery flows, SPSS Modeler flows, Jupyter notebooks, and Python and R scripts from the Assets page of a project.

Select the asset from the section for your asset type and choose Create job from the asset’s Actions menu.
Define the job details by entering a name and a description (optional).
On the Configure page, select an environment runtime for the job. Depending on the asset, you can optionally configure more settings, for example environment variables or script arguments.

Note that when you create a job to run Notebooks that were created in JupyterLab, you can select only custom environment definitions with Python 3.7 or 3.6 as the software version. Custom environment definitions with JupyterLab with Python 3.7 or 3.6 as the software version cannot be selected.
On the Schedule page, you can optionally add a one-time or repeating schedule.

If you define a start date and time without selecting Repeat, the job runs exactly once, at the specified date and time. If you define a start date and time and you select Repeat, the job runs for the first time at the timestamp indicated in the Repeat section.

You can’t change the time zone; the schedule uses your web browser’s time zone setting. If you exclude certain weekdays, the job might not run as you would expect. This can be caused by a discrepancy between the time zone of the user who creates the schedule and the time zone of the compute node where the job runs.
Review the job settings. Then, create the job and run it immediately, or create the job and run it later.
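
The weekday discrepancy mentioned in the Schedule step can be illustrated with a short calculation. This is a sketch under stated assumptions: the function name `weekday_on_node` is hypothetical, fixed UTC offsets stand in for real time zones, and the compute node is assumed to run on UTC, which the documentation does not state.

```python
from datetime import datetime, timedelta, timezone

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def weekday_on_node(local_start, browser_utc_offset_hours, node_utc_offset_hours=0):
    """Return (browser weekday, node weekday) for the same scheduled instant."""
    browser_tz = timezone(timedelta(hours=browser_utc_offset_hours))
    node_tz = timezone(timedelta(hours=node_utc_offset_hours))
    start = local_start.replace(tzinfo=browser_tz)  # time as entered in the browser
    on_node = start.astimezone(node_tz)             # same instant, seen from the node
    return DAYS[start.weekday()], DAYS[on_node.weekday()]


# Friday 22:00 in a browser at UTC-5; the node is assumed to be on UTC.
print(weekday_on_node(datetime(2024, 1, 5, 22, 0), -5))
# ('Friday', 'Saturday')
```

In this example, a schedule that excludes Saturdays could skip a run that the user thinks of as a Friday run, because the same instant already falls on Saturday for the node.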

Viewing jobs in a tool

You can view and edit jobs associated with an asset directly in the following tools:

Data Refinery
Notebook editor or viewer
SPSS Modeler

To view and change job settings in a tool:

Click the Jobs icon from the toolbar and select Save and view jobs. This action lists the jobs that exist for the asset.
Select a job to see its details. You can change job settings by clicking Edit job.

Viewing job details

You can view all of the jobs that exist for your project from the project’s Jobs page. With the Admin or Editor role for the project, you can view and edit the job details, run jobs manually, and delete jobs. With the Viewer role for the project, you can only view the job details.
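
The role rules above can be summarized as a small lookup table. This is an illustrative sketch only: the names `ROLE_ACTIONS` and `can` are hypothetical and do not correspond to a product API.

```python
# Job actions allowed per project role, as described in the paragraph above.
ROLE_ACTIONS = {
    "Admin": {"view", "edit", "run", "delete"},
    "Editor": {"view", "edit", "run", "delete"},
    "Viewer": {"view"},
}


def can(role, action):
    """Return True if the given project role is allowed to perform the action."""
    return action in ROLE_ACTIONS.get(role, set())


print(can("Editor", "delete"))  # True
print(can("Viewer", "edit"))    # False
```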

To view the details of a specific job, click the job. From the job’s details page, you can:

View the runs for that job and the status of each run. If a run failed, you can select the run and view the log tail or download the entire log file to help you troubleshoot the run. A failed run might be related to a temporary connection or environment problem. Try running the job again. If the job still fails, you can send the log to Customer Support.

Note that only notebook output cells of type text are logged during a notebook job run. To monitor the output of your notebook cells in the log file for the job run, use the print() function.
Edit job settings by clicking Edit job, for example to change schedule settings or to pick another environment definition.
Run the job manually by clicking the run icon from the job’s action bar. You can only run a job manually if no schedule was defined for the job.
Delete the job by clicking the delete icon from the job’s action bar.
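
Because only text output is logged during a notebook job run, a notebook cell can print the values that you want to find in the log. The following is a minimal sketch; the `summarize` function and the sample data are hypothetical.

```python
def summarize(rows):
    """Build a plain-text summary line for the job run log."""
    total = sum(r["amount"] for r in rows)
    return f"rows={len(rows)} total={total}"


rows = [{"amount": 10}, {"amount": 32}]

# print() writes plain text to standard output, which is the kind of
# output that the note above says is captured in the job run log.
print(summarize(rows))
# rows=2 total=42
```

Printing a short status line at the end of each major step makes it easier to see in the log tail how far a failed run progressed.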