Requirements for creating a custom foundation model on a MIG-enabled cluster

Review the hardware and software requirements for deploying custom foundation models on MIG-enabled clusters.

Hardware requirements

Review these hardware requirements for deploying your custom foundation models on a GPU-enabled cluster.

GPU hardware requirements

You can deploy custom foundation models on MIG-enabled clusters for the following NVIDIA GPU hardware specifications:

  • NVIDIA A100 with 80 GB of GPU memory
  • NVIDIA H100 with 80 GB of GPU memory

The following hardware configuration cannot be used to deploy a custom foundation model on a MIG-enabled cluster:

  • NVIDIA L40S with 48 GB of GPU memory

Hardware specifications

You must create a custom hardware specification to deploy your custom foundation model on a MIG-enabled cluster. The predefined hardware specifications (WX-S, WX-M, WX-L, and WX-XL) can be used only with dedicated GPU nodes (NVIDIA A100 or H100 with 80 GB of GPU memory); you cannot use them for deployment on a MIG-enabled cluster.
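As an illustration of what a custom hardware specification might contain, the sketch below builds a specification payload as a plain Python dictionary. The field names (`nodes`, `cpu`, `mem`, `gpu`) and all values are assumptions chosen for illustration, not the exact schema your release expects; consult the hardware-specification documentation for your product version.

```python
# Illustrative custom hardware specification for a MIG-enabled deployment.
# All field names and values below are hypothetical placeholders, not a
# confirmed product schema.
custom_hw_spec = {
    "name": "custom-mig-hw-spec",      # hypothetical specification name
    "description": "Custom spec for a MIG slice of an A100/H100 GPU",
    "nodes": {
        "cpu": {"units": "2"},         # vCPUs reserved for the model runtime
        "mem": {"size": "128Gi"},      # host memory for the model server
        "gpu": {"num_gpu": 1},         # one MIG slice exposed as a GPU
    },
}

def validate_hw_spec(spec: dict) -> bool:
    """Minimal sanity check: a custom spec must be named and size all node resources."""
    nodes = spec.get("nodes", {})
    return bool(spec.get("name")) and {"cpu", "mem", "gpu"} <= nodes.keys()

print(validate_hw_spec(custom_hw_spec))  # True for the payload above
```

A check like `validate_hw_spec` lets a deployment script fail early, before the specification is submitted to the cluster.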

Software requirements

You can deploy custom foundation models on MIG-enabled clusters with the vLLM model runtime only. Use the watsonx-cfm-caikit-1.1 software specification for deployment.
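To show where the required software specification fits, here is a minimal sketch of deployment metadata. Only the software specification name (watsonx-cfm-caikit-1.1) comes from the requirements above; every other field name and value is a hypothetical placeholder, not a confirmed product API.

```python
# Illustrative deployment metadata for a custom foundation model on a
# MIG-enabled cluster. Only "watsonx-cfm-caikit-1.1" is taken from the
# documented requirements; the rest is a hypothetical sketch.
deployment_meta = {
    "name": "my-custom-fm-deployment",                    # hypothetical name
    "software_spec": {"name": "watsonx-cfm-caikit-1.1"},  # required for vLLM
    "hardware_spec": {"name": "custom-mig-hw-spec"},      # your custom spec
    "online": {},                                         # online (real-time) serving
}

print(deployment_meta["software_spec"]["name"])  # watsonx-cfm-caikit-1.1
```

Pairing the custom hardware specification with the watsonx-cfm-caikit-1.1 software specification in one metadata object keeps both MIG requirements visible in a single place.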

Supported foundation models

You can deploy the following custom foundation models on a MIG-enabled cluster:

Supported foundation models for deploying custom foundation models on MIG-enabled clusters
Model family    Model
llama           meta-llama/Meta-Llama-3-8B
databricks      databricks/dolly-v2-12b
granite         granite-3b-code-instruct
gpt_neox        rinna/japanese-gpt-neox-small
gpt_bigcode     bigcode/gpt_bigcode-santacoder
phi             microsoft/phi-1_5, microsoft/phi-2
mpt             mosaicml/mpt-7b-storywriter
falcon          tiiuae/falcon-7b
mixtral         TheBloke/Mixtral-8x7B-v0.1-GPTQ
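The supported-models table above can be mirrored in a small helper so that a deployment script can fail fast before uploading an unsupported model. The table contents come from this document; the helper function itself is an illustrative convenience, not part of any product API.

```python
# Supported model families and model IDs, copied from the table above.
SUPPORTED_MIG_MODELS = {
    "llama": ["meta-llama/Meta-Llama-3-8B"],
    "databricks": ["databricks/dolly-v2-12b"],
    "granite": ["granite-3b-code-instruct"],
    "gpt_neox": ["rinna/japanese-gpt-neox-small"],
    "gpt_bigcode": ["bigcode/gpt_bigcode-santacoder"],
    "phi": ["microsoft/phi-1_5", "microsoft/phi-2"],
    "mpt": ["mosaicml/mpt-7b-storywriter"],
    "falcon": ["tiiuae/falcon-7b"],
    "mixtral": ["TheBloke/Mixtral-8x7B-v0.1-GPTQ"],
}

def is_supported_on_mig(model_id: str) -> bool:
    """Return True if model_id appears in the supported-models table."""
    return any(model_id in models for models in SUPPORTED_MIG_MODELS.values())

print(is_supported_on_mig("tiiuae/falcon-7b"))           # True
print(is_supported_on_mig("mistralai/Mistral-7B-v0.3"))  # False
```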

The following foundation models are not supported for deployment on MIG-enabled clusters with the vLLM runtime:

Foundation models that are not supported for deployment on MIG-enabled clusters
Model family    Model
llama           llama-2-70b-chat-hf
t5              flan-t5-xl
mixtral         TheBloke/Mixtral-8x7B-v0.1-GPTQ
gptj            nomic-ai/gpt4all-j
mistral         mistralai/Mistral-7B-v0.3

To learn more about the foundation models that are supported for each of these architectures, see Supported foundation models for vLLM runtime.

Parent topic: Deploying a custom foundation model