Deploying watsonx.ai capabilities

You can access watsonx.ai services on the IBM Fusion HCI System.

To leverage watsonx.ai on premises, purchase IBM Fusion HCI System and install the watsonx.ai application as part of IBM Software Hub. Alternatively, you can install watsonx.ai on IBM Fusion HCI System through an operator by using the IBM container registry.

For more information about watsonx.ai, see https://www.ibm.com/products/watsonx-ai.

watsonx.ai is supported on a single IBM Fusion HCI System with the following GPU node configurations:
  • Up to two G02 nodes: Each node can have up to 3 x NVIDIA A100 80GB GPUs (6 total NVIDIA A100 80GB GPUs per rack).
  • Up to two G03 nodes: Each node can have up to 8 x NVIDIA L40S 48GB GPUs (16 total NVIDIA L40S GPUs per rack).

Optionally, you can add tolerations so that watsonx.ai workloads can run on the GPU nodes.
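For example, a toleration could be added by patching the workload's Deployment with the Kubernetes Python client, as in the following sketch. The taint key, deployment name, and namespace shown here are illustrative assumptions rather than values defined by IBM Fusion HCI System or IBM Software Hub; substitute the ones used in your environment.

```python
# Hedged sketch: add a toleration to a workload Deployment so that its pods can
# be scheduled onto tainted GPU nodes. The taint key "nvidia.com/gpu", the
# deployment name, and the namespace are assumptions for illustration only.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when run inside the cluster

toleration_patch = {
    "spec": {
        "template": {
            "spec": {
                "tolerations": [
                    {
                        "key": "nvidia.com/gpu",   # assumed taint key on the GPU nodes
                        "operator": "Exists",
                        "effect": "NoSchedule",
                    }
                ]
            }
        }
    }
}

apps = client.AppsV1Api()
apps.patch_namespaced_deployment(
    name="example-watsonxai-workload",  # hypothetical deployment name
    namespace="cpd-instance",           # hypothetical IBM Software Hub project
    body=toleration_patch,
)
```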

Important: Do not install the KubeVirt HyperConverged Cluster Operator. It can cause problems when you install IBM Software Hub software.

IBM Fusion HCI System is a hosting platform for watsonx.ai and provides the following capabilities:
  • Leverage the watsonx.ai user interface and REST APIs.
  • Schedule regular online backups of a watsonx.ai instance to recover to an earlier point in time in the event of data loss or corruption.
  • Monitor the model serving and tuning pipeline KPIs by using built-in OpenShift® monitoring capabilities (see the sketch after this list).
  • Leverage the watsonx.ai health and performance monitoring through the IBM Software Hub common services.
  • Extend the Backup & Restore service recipes of IBM Fusion HCI System for the watsonx.ai component as well.
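As one illustration of the monitoring capability above, the built-in OpenShift monitoring stack exposes a Prometheus-compatible API through the Thanos Querier route, which can be queried for GPU and serving KPIs. The following sketch is an example under stated assumptions: the route host, the bearer token, and the DCGM_FI_DEV_GPU_UTIL metric name are placeholders to adapt to your cluster, not values documented for IBM Fusion HCI System.

```python
# Hedged sketch: query the cluster's built-in monitoring stack for average GPU
# utilization. Route host, token, and metric name are illustrative assumptions.
import requests

THANOS_QUERIER = "https://thanos-querier-openshift-monitoring.apps.example.com"  # assumed route host
TOKEN = "sha256~REPLACE_WITH_YOUR_TOKEN"  # for example, the output of `oc whoami -t`

response = requests.get(
    f"{THANOS_QUERIER}/api/v1/query",
    params={"query": "avg(DCGM_FI_DEV_GPU_UTIL)"},  # assumed NVIDIA DCGM exporter metric
    headers={"Authorization": f"Bearer {TOKEN}"},
    verify=False,  # quick test only; point verify at the cluster CA bundle in practice
)
response.raise_for_status()
for result in response.json()["data"]["result"]:
    print(result["metric"], result["value"])
```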
Note:
  • In this release, watsonx.ai support is not available for high-availability IBM Fusion HCI System.
  • The Llama-2-70b-chat model is not supported with the current GPU nodes available with the IBM Fusion HCI System due to the size of the foundation model.