Requirements for creating a custom foundation model on a MIG-enabled cluster
Review the hardware and software requirements for deploying custom foundation models on MIG-enabled clusters.
Hardware requirements
Review these hardware requirements for deploying your custom foundation models on a GPU-enabled cluster.
GPU hardware requirements
You can deploy custom foundation models on MIG-enabled clusters for the following NVIDIA GPU hardware specifications:
- NVIDIA A100 with 80 GB of GPU memory
- NVIDIA H100 with 80 GB of GPU memory
The following hardware configuration cannot be used to deploy a custom foundation model on a MIG-enabled cluster:
- NVIDIA L40S with 48 GB of GPU memory
Hardware specifications
You must create a custom hardware specification to deploy your custom foundation model on a MIG-enabled cluster. The predefined hardware specifications (WX-S, WX-M, WX-L, and WX-XL) can be used only with dedicated GPU nodes (NVIDIA A100 or H100 with 80 GB of GPU memory), so you cannot use them for deployments on MIG-enabled clusters.
Software requirements
You can deploy custom foundation models on MIG-enabled clusters with the vLLM model runtime only. Use the software specification watsonx-cfm-caikit-1.1 for deployment.
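To show how the required software specification fits into a deployment request, here is a minimal sketch of a request body. Only the software specification name watsonx-cfm-caikit-1.1 comes from this topic; every other key and value (including the deployment and hardware specification names) is a hypothetical placeholder and may differ in your environment.

```python
# Sketch of a deployment request body. Apart from the software
# specification name "watsonx-cfm-caikit-1.1" (stated in this topic),
# all keys and names below are hypothetical placeholders.
deployment_payload = {
    "name": "my-custom-fm-deployment",                    # placeholder name
    "online": {},                                         # online (real-time) deployment
    "hardware_spec": {"name": "my-custom-hw-spec"},       # your custom hardware spec
    "software_spec": {"name": "watsonx-cfm-caikit-1.1"},  # required for vLLM runtime
}
```

Note that the hardware specification referenced here must be the custom one you created, since the predefined specifications cannot be used on MIG-enabled clusters.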
Supported foundation models
You can deploy the following custom foundation models on a MIG-enabled cluster:
| Model family | Model |
|---|---|
| llama | meta-llama/Meta-Llama-3-8B |
| databricks | databricks/dolly-v2-12b |
| granite | granite-3b-code-instruct |
| gpt_neox | rinna/japanese-gpt-neox-small |
| gpt_bigcode | bigcode/gpt_bigcode-santacoder |
| phi | microsoft/phi-1_5, microsoft/phi-2 |
| mpt | mosaicml/mpt-7b-storywriter |
| falcon | tiiuae/falcon-7b |
| mixtral | TheBloke/Mixtral-8x7B-v0.1-GPTQ |
The following foundation models are not supported for deployment on MIG-enabled clusters with the vLLM runtime:
| Model family | Model |
|---|---|
| llama | llama-2-70b-chat-hf |
| t5 | flan-t5-xl |
| mixtral | TheBloke/Mixtral-8x7B-v0.1-GPTQ |
| gptj | nomic-ai/gpt4all-j |
| mistral | mistralai/Mistral-7B-v0.3 |
To learn more about the foundation models that are supported for each of these architectures, see Supported foundation models for vLLM runtime.
Parent topic: Deploying a custom foundation model