IBM Cloud® offers a broad range of NVIDIA GPUs, such as the H200 and L40S, to best fit your specific needs and AI workloads, like training, inferencing, and fine-tuning. The GPUs support a wide range of generative AI inferencing applications, capabilities, and frameworks, including large language models (LLMs) and multi-modal models (MMMs). Get your AI workload into production quickly, based on your workload placement goals, with multi-platform enablement, including IBM Cloud Virtual Servers for VPC, IBM watsonx®, Red Hat® Enterprise Linux® AI (RHEL AI) or OpenShift® AI, and deployable architectures.
NVIDIA GPUs are paired with 4th Gen Intel® Xeon® processors on IBM Cloud Virtual Servers for VPC. There are several ways to adopt and deploy based on your infrastructure and software requirements.
NVIDIA GPUs can be deployed as IBM Cloud Virtual Servers for VPC instances. IBM Cloud VPC is designed for high resiliency and security inside a software-defined network (SDN), where clients can build isolated private clouds while maintaining essential public cloud benefits. NVIDIA GPU cloud instances, which also support Red Hat Enterprise Linux AI (RHEL AI) images, are ideal for clients with highly specialized software stacks or those who require full control over their underlying server.
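As a minimal sketch, provisioning a GPU instance of this kind can be done with the IBM Cloud CLI and its VPC plugin. The resource names, region, zone, and image below are illustrative placeholders, not values from this page; the `gx3-24x120x1l40s` profile corresponds to the 1x L40S configuration (24 vCPU, 120 GiB RAM) listed in the table further down.

```shell
# Sketch: provision an NVIDIA L40S virtual server on IBM Cloud VPC.
# Assumes the IBM Cloud CLI with the VPC plugin is installed:
#   ibmcloud plugin install vpc-infrastructure
# All names/IDs (my-vpc, my-subnet, my-ssh-key, the image) are placeholders.

ibmcloud login --apikey "$IBMCLOUD_API_KEY" -r us-south

# Profile gx3-24x120x1l40s = 24 vCPU, 120 GiB RAM, 1x NVIDIA L40S 48 GB
ibmcloud is instance-create my-gpu-vsi my-vpc us-south-1 gx3-24x120x1l40s my-subnet \
  --image ibm-ubuntu-22-04-minimal-amd64-1 \
  --keys my-ssh-key
```

Once the instance is running, the NVIDIA drivers and CUDA toolkit can be installed on the guest OS as with any GPU server.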
Clients requiring full control over their entire AI stack, from infrastructure to workload, can deploy IBM watsonx.ai® to their NVIDIA GPU-based virtual server on IBM Cloud VPC. IBM watsonx.ai is a one-stop, integrated, end-to-end AI development studio that features an AI developer toolkit and full AI lifecycle management for developing AI services and deploying them into your applications of choice.
Clients who want the freedom to choose AI frameworks, while also helping to ensure rapid, secure deployment of their AI workloads, can use our deployable architectures for NVIDIA GPUs on IBM Cloud.
Red Hat OpenShift AI is a flexible, scalable artificial intelligence (AI) and machine learning (ML) platform that enables enterprises to create and deliver AI-enabled applications at scale across hybrid cloud environments. Built using open source technologies, OpenShift AI provides trusted, operationally consistent capabilities for teams to experiment, serve models and deliver innovative apps.
Cluster your NVIDIA GPU instances over a 3.2 Tbps network with RoCE v2 support
| GPU | Configuration | vCPU | RAM | Deploy with |
|---|---|---|---|---|
| NVIDIA H200 GPU - For large traditional AI and generative AI models | 8x NVIDIA H200 141 GB | 160 | 1792 GiB | Virtual Server for VPC, Red Hat OpenShift |
| NVIDIA H100 GPU - For large traditional AI and generative AI models | 8x NVIDIA H100 80 GB | 160 | 1792 GiB | Virtual Server for VPC, Red Hat OpenShift |
| NVIDIA A100-PCIe GPU - For traditional AI and generative AI models | 1x NVIDIA A100 80 GB | 24 | 120 GiB | Virtual Server for VPC, Red Hat OpenShift |
| NVIDIA A100-PCIe GPU | 2x NVIDIA A100 80 GB | 48 | 240 GiB | Virtual Server for VPC, Red Hat OpenShift |
| NVIDIA L40S GPU - For small to mid-size models | 1x NVIDIA L40S 48 GB | 24 | 120 GiB | Virtual Server for VPC, Red Hat OpenShift |
| NVIDIA L40S GPU | 2x NVIDIA L40S 48 GB | 48 | 240 GiB | Virtual Server for VPC, Red Hat OpenShift |
| NVIDIA L4 GPU - For small AI models that require smaller memory | 1x NVIDIA L4 24 GB | 16 | 80 GiB | Virtual Server for VPC, Red Hat OpenShift |
| NVIDIA L4 GPU | 2x NVIDIA L4 24 GB | 32 | 160 GiB | Virtual Server for VPC, Red Hat OpenShift |
| NVIDIA L4 GPU | 4x NVIDIA L4 24 GB | 64 | 320 GiB | Virtual Server for VPC, Red Hat OpenShift |
| NVIDIA V100 GPU - For a small AI footprint to start with | 1x NVIDIA V100 16 GB | 8 | 64 GiB | Virtual Server for VPC, Red Hat OpenShift |