Deploying custom foundation models on dedicated GPUs

You can deploy a custom foundation model on dedicated GPU clusters. Review the requirements and supported model architectures for deploying custom foundation models on dedicated GPUs.

Prerequisites

Before you can deploy your custom foundation model, a system administrator must prepare the model and upload it to persistent volume claim (PVC) storage on the cluster. After storing the model, the administrator must register the model with watsonx.ai.
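To confirm that the registration succeeded before you start the deployment tasks, you can list the registered custom foundation models through the watsonx.ai REST API. The following Python sketch is a minimal example, assuming a /ml/v4/custom_foundation_models endpoint, a bearer token in the CPD_TOKEN environment variable, and the cluster URL in CPD_URL; confirm the endpoint path and version value against the API reference for your release.

    # Minimal sketch: list the custom foundation models that are registered
    # with watsonx.ai. The endpoint path, version value, and environment
    # variable names are assumptions -- adjust them for your cluster.
    import os
    import requests

    CPD_URL = os.environ["CPD_URL"]      # for example, https://cpd-example.apps.mycluster.com
    TOKEN = os.environ["CPD_TOKEN"]      # bearer token for your platform user

    response = requests.get(
        f"{CPD_URL}/ml/v4/custom_foundation_models",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"version": "2024-05-01"},
        verify=False,  # only for clusters that use self-signed certificates
    )
    response.raise_for_status()

    # Print each registered model entry so you can confirm your model is listed.
    for entry in response.json().get("resources", []):
        print(entry)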

Tasks for deployment

Complete these tasks to deploy a custom foundation model on dedicated GPUs.

  1. Review requirements: Review the requirements and supported model architectures for deploying custom foundation models on GPU-enabled clusters.
  2. Optional: Create a custom hardware specification: Create a custom hardware specification that defines the compute resources, such as the number of GPUs, for your deployment.
  3. Deploy custom foundation model asset: Deploy your custom foundation model asset as an online or batch deployment. If you deploy your custom foundation model asset with the REST API, you must first create a repository asset for your custom foundation model in Watson Machine Learning, as shown in the sketch after this list.
  4. Test custom foundation model deployment: Test your deployed custom foundation model for online inferencing or batch scoring, as shown in the inference sketch after this list.
  5. Manage custom foundation model deployment: Access and update the deployment details. You can also scale or delete the deployment.
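When you deploy with the REST API, the overall flow is to create the repository asset in Watson Machine Learning and then create an online deployment that references the asset and your hardware specification. The following Python sketch outlines that flow under stated assumptions: the model type string, the software and hardware specification names, and the foundation_model field are illustrative placeholders, so replace them with the values documented for your watsonx.ai release.

    # Minimal sketch of deploying a custom foundation model with the REST API.
    # SPACE_ID, the type string, the software and hardware specification names,
    # and the foundation_model field are placeholders for illustration only.
    import os
    import requests

    CPD_URL = os.environ["CPD_URL"]
    TOKEN = os.environ["CPD_TOKEN"]
    SPACE_ID = os.environ["SPACE_ID"]
    HEADERS = {"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"}
    VERSION = {"version": "2024-05-01"}

    # Step 1: Create the repository asset that points at the registered model.
    model_payload = {
        "name": "my-custom-fm",
        "space_id": SPACE_ID,
        "type": "custom_foundation_model_1.0",                     # assumed type string
        "software_spec": {"name": "watsonx-cfm-caikit-1.1"},       # assumed software specification
        "foundation_model": {"model_id": "my-registered-model"},   # assumed field name
    }
    model_response = requests.post(
        f"{CPD_URL}/ml/v4/models", headers=HEADERS, params=VERSION,
        json=model_payload, verify=False,
    )
    model_response.raise_for_status()
    model_id = model_response.json()["metadata"]["id"]

    # Step 2: Create an online deployment of the asset on the chosen hardware specification.
    deployment_payload = {
        "name": "my-custom-fm-deployment",
        "space_id": SPACE_ID,
        "asset": {"id": model_id},
        "online": {},
        "hardware_spec": {"name": "my-gpu-hw-spec"},  # custom or predefined specification
    }
    deployment_response = requests.post(
        f"{CPD_URL}/ml/v4/deployments", headers=HEADERS, params=VERSION,
        json=deployment_payload, verify=False,
    )
    deployment_response.raise_for_status()
    print("Deployment ID:", deployment_response.json()["metadata"]["id"])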
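To test an online deployment, you can send a request to the text generation endpoint of the deployment and inspect the generated output. The sketch below is a minimal example, assuming a /ml/v1/deployments/{deployment_id}/text/generation endpoint and the parameter and response field names shown; verify them against the inference API reference for your release.

    # Minimal sketch of testing a custom foundation model deployment online.
    # DEPLOYMENT_ID is the id returned when you created the deployment; the
    # endpoint path and field names may vary by release, so confirm them first.
    import os
    import requests

    CPD_URL = os.environ["CPD_URL"]
    TOKEN = os.environ["CPD_TOKEN"]
    DEPLOYMENT_ID = os.environ["DEPLOYMENT_ID"]

    response = requests.post(
        f"{CPD_URL}/ml/v1/deployments/{DEPLOYMENT_ID}/text/generation",
        headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
        params={"version": "2024-05-01"},
        json={
            "input": "Summarize the benefits of deploying on dedicated GPUs in one sentence.",
            "parameters": {"max_new_tokens": 100},
        },
        verify=False,  # only for clusters that use self-signed certificates
    )
    response.raise_for_status()

    # The generated text is expected in the first entry of the results list.
    print(response.json()["results"][0]["generated_text"])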

Parent topic: Deploying custom foundation models on GPU-enabled clusters