Deploying a tuned foundation model
Deploy a tuned model, then add it to a business workflow, and start using foundation models in a meaningful way.
Service: The watsonx.ai service and other supplemental services are not available by default. An administrator must install these services on the IBM Cloud Pak for Data platform. To determine whether a service is installed, open the Services catalog and check whether the service is enabled.
Before you begin
The tuning experiment that you used to tune the foundation model must be finished. For more information, see Tuning a foundation model.
Deploy a tuned model
To deploy a tuned model, complete the following steps:
-
From the navigation menu, expand Projects, and then click All projects.
-
Click to open your project.
-
From the Assets tab, click the Experiments asset type.
-
Click to open the tuning experiment for the model you want to deploy.
-
From the Tuned models list, find the completed tuning experiment, and then click New deployment.
-
Name the tuned model.
The name of the tuning experiment is used as the tuned model name if you don't change it. The name has a number after it in parentheses, which counts the deployments. The number starts at one and is incremented by one each time you deploy this tuning experiment.
-
Optional: Add a description and tags.
-
In the Target deployment space field, choose a deployment space.
The deployment space must be associated with a machine learning instance that is in the same account as the project where the tuned model was created.
If you don't have a deployment space, choose Create a new deployment space, and then follow the steps in Creating deployment spaces.
For more information, see What is a deployment space?
-
In the Deployment serving name field, add a label for the deployment.
The serving name is used in the URL for the API endpoint that identifies your deployment. Adding a name is helpful because the human-readable name that you add replaces a long, system-generated ID that is assigned otherwise.
The serving name also abstracts the deployment from its service instance details. Applications can refer to this name, which allows the underlying service instance to be changed without impacting users.
The name can have up to 36 characters. The supported characters are lowercase letters, digits, and underscores: [a-z,0-9,_].
The name must be unique across the IBM Cloud region. You might be prompted to change the serving name if the name you choose is already in use.
-
Tip: Select View deployment in deployment space after creating. Otherwise, you need to take more steps to find your deployed model.
-
Click Deploy.
After the tuned model is promoted to the deployment space and deployed, a copy of the tuned model is stored in your project as a model asset.
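The serving name rules from the steps above (up to 36 characters, drawn from a-z, 0-9, and underscore) can be checked before you type the name into the form. The following is a minimal sketch, not part of the product: the function names `is_valid_serving_name` and `suggest_serving_name` are hypothetical, and the minimum length of one character is an assumption because the text states only the upper limit. Uniqueness cannot be checked locally, so the service might still ask you to pick a different name.

```python
import re

# Rule taken from the text above: each character is a-z, 0-9, or underscore,
# and the name has at most 36 characters.
# Assumption: an empty name is not allowed, so the minimum length is 1.
SERVING_NAME_PATTERN = re.compile(r"[a-z0-9_]{1,36}")


def is_valid_serving_name(name: str) -> bool:
    """Return True if the name satisfies the documented length and character rules."""
    return SERVING_NAME_PATTERN.fullmatch(name) is not None


def suggest_serving_name(label: str) -> str:
    """Hypothetical helper: turn a free-form label into a candidate serving name.

    Lowercases the label, replaces every unsupported character with an
    underscore, and truncates the result to 36 characters.
    """
    cleaned = re.sub(r"[^a-z0-9_]", "_", label.lower())
    return cleaned[:36]


if __name__ == "__main__":
    for candidate in ["claims_summarizer_v2", "Claims-Summarizer", "a" * 37]:
        print(candidate, is_valid_serving_name(candidate))
    print(suggest_serving_name("Claims Summarizer v2"))
```

A name that passes this check can still be rejected if another deployment already uses it.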
What is a deployment space?
When you create a new deployment, a tuned model is promoted to a deployment space, and then deployed. A deployment space is separate from the project where you create the asset. A deployment space is associated with the following services that it uses to deploy assets:
-
Watson Machine Learning: A product with tools and services you can use to build, train, and deploy machine learning models. This service hosts your tuned model.
-
IBM Cloud Object Storage: A secure platform for storing structured and unstructured data. Your deployed model asset is stored in a Cloud Object Storage bucket that is associated with your project.
For more information, see Deployment spaces.
Testing the deployed model
The true test of your tuned model is how it responds to input that follows the patterns that the model was tuned for.
You can test the tuned model from one of the following pages:
- Prompt Lab: A tool with an intuitive user interface for prompting foundation models. You can customize the prompt parameters for each input. You can also save the prompt as a notebook so you can interact with it programmatically.
- Deployment space: Useful when you want to test your model programmatically. From the API Reference tab, you can find information about the available endpoints and code examples. You can also submit input as text and choose to return the output all at once or as a stream, as the output is generated. However, you cannot change the prompt parameters for the input text.
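To illustrate how the serving name ends up in the endpoint URL, and how the choice between full and streamed output might map to two endpoints, here is a minimal sketch that only builds the request URL and JSON body. It makes no network call. The path segments (`/ml/v1/deployments/.../text/generation` and `.../text/generation_stream`), the `version` query value, and the `input` body field are assumptions modeled on the public watsonx.ai REST API; the host `cpd.example.com`, the serving name, and the function name `build_generation_request` are hypothetical. Copy the actual endpoint from the API Reference tab of your deployment, and note that a real call also needs an authorization header.

```python
import json
from urllib.parse import quote, urlencode


def build_generation_request(base_url, serving_name, prompt,
                             stream=False, version="2023-05-29"):
    """Build the URL and JSON body for a text generation request.

    Assumption: the endpoint layout and the version value follow the public
    watsonx.ai REST API; verify both against the API Reference tab.
    """
    # Assumed endpoint names: one returns the whole output, one streams it.
    endpoint = "generation_stream" if stream else "generation"
    url = (
        f"{base_url.rstrip('/')}/ml/v1/deployments/"
        f"{quote(serving_name, safe='')}/text/{endpoint}"
        f"?{urlencode({'version': version})}"
    )
    # No prompt parameters are sent, so the deployment defaults apply.
    body = json.dumps({"input": prompt})
    return url, body


if __name__ == "__main__":
    url, body = build_generation_request(
        "https://cpd.example.com",          # hypothetical cluster host
        "claims_summarizer_v2",             # hypothetical serving name
        "Summarize the following claim: ...",
        stream=True,
    )
    print(url)
    print(body)
```

Keeping the serving name as a parameter reflects the point made earlier: the application depends only on the name, not on the system-generated deployment ID.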
To test your tuned model, complete the following steps:
-
From the navigation menu, select Deployments.
-
Click the name of the deployment space where you deployed the tuned model.
-
Click the name of your deployed model.
-
Follow the appropriate steps based on where you want to test the tuned model:
-
From Prompt Lab:
-
Click Open in Prompt Lab, and then choose the project where you want to work with the model.
Prompt Lab opens with the tuned model that you deployed selected in the Model field.
-
In the Try section, add a prompt to the Input field that follows the prompt pattern that your tuned model is trained to recognize, and then click Generate.
For more information about how to use the prompt editor, see Prompt Lab.
-
From the deployment space:
-
Click the Test tab.
-
In the Input data field, add a prompt that follows the prompt pattern that your tuned model is trained to recognize, and then click Generate.
You can click View parameter settings to see the prompt parameters that are applied to the model by default. To change the prompt parameters, you must go to the Prompt Lab.