Managing predictive deployments

To deploy assets, you set up a deployment space and then select and configure a specific deployment type. After you deploy assets, you can manage and update them to make sure that they perform well, and you can monitor their accuracy.

To be able to deploy assets from a space, you must install and provision the Watson Machine Learning service.

Required service
The Watson Machine Learning service is not available by default. An administrator must install this service on the IBM Cloud Pak for Data platform. To determine whether the service is installed, open the Services catalog and check whether the Watson Machine Learning service is enabled.

Online and batch deployments provide simple ways to create an online scoring endpoint or do batch scoring with your models.
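
For example, with the Watson Machine Learning Python client you can create an online deployment for a stored model and then score it. The following is a minimal sketch, not a definitive recipe: the credential values, space ID, and model ID are placeholders, the exact credential fields depend on your Cloud Pak for Data configuration, and helper names can vary slightly by client version.

  from ibm_watson_machine_learning import APIClient

  # Placeholder credentials for a Cloud Pak for Data cluster
  wml_credentials = {
      "url": "https://<cluster-url>",
      "username": "<username>",
      "apikey": "<api-key>",
      "instance_id": "openshift",
      "version": "4.8",
  }
  client = APIClient(wml_credentials)
  client.set.default_space("<space-id>")  # work in your deployment space

  # Create an online deployment for a model that is already stored in the space
  deployment = client.deployments.create(
      artifact_uid="<model-id>",
      meta_props={
          client.deployments.ConfigurationMetaNames.NAME: "My online deployment",
          client.deployments.ConfigurationMetaNames.ONLINE: {},
      },
  )
  deployment_id = client.deployments.get_id(deployment)

  # Score the online endpoint with a sample payload
  payload = {"input_data": [{"fields": ["x1", "x2"], "values": [[1.0, 2.0]]}]}
  print(client.deployments.score(deployment_id, payload))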

If you want to implement custom logic:

  • Create a Python function to use for creating your online endpoint (see the sketch after this list)
  • Write a notebook or script for batch scoring
Note: If you create a notebook or a script to perform batch scoring, the asset runs as a platform job, not as a batch deployment.
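
As a sketch of the custom-logic pattern: a deployable Python function wraps your logic in a nested score() function, which Watson Machine Learning calls with each scoring request. The doubling logic here is purely illustrative.

  def my_deployable_function():
      """One-time setup (imports, loading artifacts) goes in this
      outer closure; it runs when the function is deployed."""

      def score(payload):
          # The payload uses the standard scoring format:
          # {"input_data": [{"fields": [...], "values": [...]}]}
          values = payload["input_data"][0]["values"]
          # Illustrative custom logic: double every input value
          results = [[v * 2 for v in row] for row in values]
          return {"predictions": [{"values": results}]}

      return score

To deploy it, you store the function in the space, for example with client.repository.store_function(), and then create an online deployment for the stored function in the same way as for a model.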

Deployable assets

The following table lists the assets that you can deploy from a Watson Machine Learning space, with the applicable deployment types:

List of assets that you can deploy

Asset type    Batch deployment    Online deployment    App deployment
Functions     Yes                 Yes                  No
Models        Yes                 Yes                  No
Scripts       Yes                 No                   No
Shiny apps    No                  No                   Yes

An R Shiny app is the only asset type that is supported for web app deployments.

Notes:

  • A deployment job is a way of running a batch deployment, or a self-contained asset such as a code package or flow, in Watson Machine Learning. You can select the input and output for your job and choose to run it manually or on a schedule; see the sketch after these notes. For more information, see Creating a deployment job.
  • You can deploy a Natural Language Processing model by using Python functions or Python scripts. Both online and batch deployments are supported.
  • Notebooks and flows use notebook environments. You can run them in a deployment space, but they are not deployable.
  • If you save an AutoAI experiment as a notebook in your project and then promote the notebook to your deployment space, your notebook job might fail. This failure can happen if the runtime environment that is selected to run the deployment job for the notebook has fewer resources than the runtime environment that was originally used to run the AutoAI experiment. To avoid failure, you must promote the notebook and the environment separately to your deployment space.
  • You can use automatic mounts for storage volumes only with Watson Machine Learning Shiny app deployments and notebook runtimes. Online and batch deployments do not support automatic mounts for storage volumes.
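
As noted above, a deployment job runs a batch deployment with a chosen input and output. A minimal sketch with the Python client follows; it assumes an authenticated client with a default space set and an existing batch deployment, the IDs are placeholders, and exact helper names can vary by client version.

  # Assumes `client` is an authenticated APIClient with a default space set,
  # and that a batch deployment with ID <deployment-id> already exists.
  job = client.deployments.create_job(
      deployment_id="<deployment-id>",
      meta_props={
          client.deployments.ScoringMetaNames.INPUT_DATA: [
              {"fields": ["x1", "x2"], "values": [[1.0, 2.0], [3.0, 4.0]]}
          ],
      },
  )
  job_id = client.deployments.get_job_id(job)

  # Poll the job to check whether the batch run completed
  print(client.deployments.get_job_status(job_id))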

Some ways to manage or update a deployment are as follows:

Configuring API gateways to provide stable endpoints

Watson Machine Learning provides stable endpoints to prevent downtime. However, you might experience downtime if you move to a new Cloud Pak for Data instance or add an instance.

API gateways provide a stable URL that you can use with your Watson Machine Learning API endpoint. You can use an API gateway (available in Cloud Pak for Integration) with your deployment endpoints to handle downtime in the following cases:

  • If you have more than one instance of Cloud Pak for Data in a high-availability configuration and one of the instances fails. In this case, you can use an API gateway to switch automatically to another instance, which prevents complete failure.
  • If you have more than one application that uses the same endpoint and the deployment endpoint becomes unavailable, for example because you accidentally deleted the deployment. In this case, you can update the endpoint in the API gateway so that the applications continue to work.
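
As an illustration, an application that scores through a gateway depends only on the gateway's stable URL; the Watson Machine Learning route behind it can be updated without changing the application. The gateway URL, token, and version date below are placeholders.

  import requests

  # Stable URL configured in the API gateway; the gateway forwards requests
  # to the current Watson Machine Learning scoring endpoint (placeholder)
  GATEWAY_URL = "https://gateway.example.com/ml/v4/deployments/<deployment-id>/predictions"

  payload = {"input_data": [{"fields": ["x1", "x2"], "values": [[1.0, 2.0]]}]}
  response = requests.post(
      GATEWAY_URL,
      json=payload,
      headers={"Authorization": "Bearer <token>"},
      params={"version": "2021-06-24"},  # API version date; adjust as needed
  )
  response.raise_for_status()
  print(response.json())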

Enabling GPU and MIG support for deployment runtimes

If you are deploying a predictive machine learning model that requires significant processing power for inferencing, you can optionally configure a GPU for deployment runtimes.

You can also enable MIG support for GPUs when you want to deploy an application that does not require the full power of an entire GPU. If you are configuring MIG for GPU-accelerated workloads, all GPU-enabled nodes must adhere to a single strategy that is determined in the prior configuration steps. This requirement ensures consistent behavior across all GPU-enabled nodes in the cluster. To configure MIG support, see the NVIDIA guide for configuring MIG support.
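
As a hedged sketch with the Python client: you can attach a hardware specification to a deployment to request GPU-backed resources. The specification name is a placeholder, and which GPU or MIG-backed specifications are available depends on what your administrator configured on the cluster.

  # Assumes `client` is an authenticated APIClient with a default space set.
  # List the hardware specifications available on your cluster first:
  client.hardware_specifications.list()

  # "<gpu-spec-name>" stands in for a GPU-enabled hardware specification
  deployment = client.deployments.create(
      artifact_uid="<model-id>",
      meta_props={
          client.deployments.ConfigurationMetaNames.NAME: "GPU online deployment",
          client.deployments.ConfigurationMetaNames.ONLINE: {},
          client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {
              "name": "<gpu-spec-name>",
          },
      },
  )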

Parent topic: Deploying assets