Deploying custom foundation models on dedicated GPUs
You can deploy a custom foundation model on dedicated GPU clusters. Before you start, review the requirements and the supported model architectures.
Prerequisites
Before you can deploy your custom foundation model, a system administrator must prepare the model and upload it to persistent volume claim (PVC) storage. After the model is stored, the administrator must register it with watsonx.ai.
Tasks for deployment
Complete the following tasks to deploy custom foundation models on dedicated GPUs.
- Review requirements: Confirm that your environment meets the requirements for deploying custom foundation models on GPU-enabled clusters.
- Optional: Create a custom hardware specification: Define the hardware specification to use when you deploy your custom foundation model.
- Deploy custom foundation model asset: Deploy your custom foundation model asset as an online or batch deployment. If you deploy the asset by using the REST API, you must first create a repository asset for your custom foundation model in Watson Machine Learning.
- Test custom foundation model deployment: Test your deployed custom foundation model for online inferencing or batch scoring.
- Manage custom foundation model deployment: Access and update the deployment details. You can also scale or delete the deployment.
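The REST path described in the tasks above (create a repository asset first, then create an online or batch deployment that points at it) could be sketched roughly as follows. This is only an illustration of the ordering, not the documented API: the helper names `build_model_asset_payload` and `build_deployment_payload`, every JSON key, and the sample values are assumptions, so check the API reference for the actual request schema.

```python
import json


def build_model_asset_payload(name, model_id, space_id):
    """Assemble a request body for creating the repository asset.

    Hypothetical schema: all keys here are illustrative placeholders.
    """
    return {
        "name": name,
        "space_id": space_id,
        "foundation_model": {"model_id": model_id},
    }


def build_deployment_payload(name, asset_id, space_id, batch=False, hardware_spec=None):
    """Assemble a request body for deploying a previously created asset.

    Hypothetical schema: the deployment is marked as either online or batch,
    and an optional custom hardware specification is referenced by name.
    """
    payload = {
        "name": name,
        "space_id": space_id,
        "asset": {"id": asset_id},
    }
    # Exactly one deployment type is set, mirroring "online or batch".
    payload["batch" if batch else "online"] = {}
    if hardware_spec is not None:
        payload["hardware_spec"] = {"name": hardware_spec}
    return payload


if __name__ == "__main__":
    # Step 1: the repository asset must exist before it can be deployed.
    asset_body = build_model_asset_payload("my-cfm", "my-registered-model", "space-123")
    # Step 2: the deployment references the asset ID returned by step 1
    # ("asset-456" is a made-up placeholder for that ID).
    deploy_body = build_deployment_payload(
        "my-cfm-online", "asset-456", "space-123", hardware_spec="my-custom-spec"
    )
    print(json.dumps(asset_body, indent=2))
    print(json.dumps(deploy_body, indent=2))
```

The sketch only builds the request bodies and sends nothing, so it makes no assumptions about endpoints or authentication.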