Planning and prerequisites
Go through the CAS installation prerequisites before you install the service.
General requirements
- Ensure that you have IBM Fusion version 2.10.0.
  Note: The CAS service can be installed only if IBM Fusion is installed by using the default ibm-spectrum-fusion-ns namespace. If IBM Fusion is installed by using a custom namespace, the CAS installation cannot be completed.
- Resource requirements. See Resource requirements.
- NVIDIA requirements. See NVIDIA requirements.
- IBM Storage Scale requirements. See IBM Storage Scale requirements.
- Configure Scale remote filesystems. See Configure Scale remote file systems.
- Ensure that you have the latest IBM Storage Scale version (5.2.3.1).
- After at least one IBM Storage Scale remote file system is configured, verify in the OpenShift® console that the storage class created from the IBM Storage Scale remote file system is available for CAS. For example, if you enter ibm-spectrum-fusion, verify whether it is available.
- Configure NIMs. See Configuring NVIDIA Inference Microservices (NIMs).
Configuring flags in CasInstall CR
To use the NVIDIA re-ranker service, you can add the NVMM_NEMO_RANKER and NVMM_NEMO_RANKER_SERVICE flags to the CasInstall CR.
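As a minimal sketch, the two flags might be set on the CasInstall CR as follows. Only the flag names NVMM_NEMO_RANKER and NVMM_NEMO_RANKER_SERVICE come from this document; the apiVersion, the spec.flags field name, and the values shown are assumptions, so check the CRD schema on your installed cluster before applying anything like this:

```yaml
apiVersion: cas.ibm.com/v1        # assumed API group/version
kind: CasInstall
metadata:
  name: casinstall-sample         # hypothetical name
  namespace: ibm-spectrum-fusion-ns
spec:
  flags:                          # assumed field; flag names are from this document
    NVMM_NEMO_RANKER: "true"
    NVMM_NEMO_RANKER_SERVICE: "true"
```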
For more information, see Configuring NVIDIA Inference Microservices (NIMs).
Allowlist for proxy
- apigee.googleapis.com: Using the Apigee Hybrid runtime, you can learn about proxies, shared flows, and other key components. It also provides information about configuration and system health.
- apigeeconnect.googleapis.com: Needed for apigee-mart-server and apigee-connect communication when VPC-SC is enabled.
- binaryauthorization.googleapis.com: Optional; needed only for Anthos if binary authorization is enabled.
- gcr.io: Google Container Registry, where the container images are hosted.
- raw.githubusercontent.com: Needed to install the operator manifest.
Resource requirements
Work with an IBM representative to ensure that you have all the hardware components that are necessary for the CAS solution. Allocate three OpenShift compute nodes.
- The following resources are required for CAS to run on the three compute nodes:
  - 3 x NVIDIA L40S or H100 GPUs
  - An additional L40S or H100 GPU for the optional re-ranker service
  - 166 vCPU (83 physical cores)
  - 1024 GB memory
  - 1 TB IBM Storage Scale
- The most common IBM Storage Scale ESS configuration is 48 NVMe drives with a capacity of 30 TB in the ESS 6000 with 4 CX-7 network adapters (1.5 TB memory).
- CAS service system requirements for Content-Aware Storage (CAS):
  - vCPUs:
    - Starter configuration (<12TB ingested data): 160 vCPU (SMT=2)
    - >12TB ingested data with High Availability: 320 vCPU
  - Memory:
    - Starter configuration: 768 GiB
    - >12TB ingested data with High Availability: 2560 GiB
  - Storage: The minimum required file system size is 200 GB. For more information about IBM Storage Scale Container Native Storage Access, see Hardware requirements for IBM Storage Scale Container Native Storage Access.
  - GPU:
    - Starter configuration (<12TB ingested data):
      - 1 GPU worker node
      - 2 non-GPU worker nodes
      - Production with Multi-Instance GPU (MIG) enabled: 2 MIG-capable NVIDIA GPUs (A100 80GB, H100, RTX PRO 6000)
      - Production without Multi-Instance GPU (MIG) enabled: 6 NVIDIA GPUs (L40S, A100 40GB, A100 80GB, H100, H200, RTX PRO 6000)
      - Non-production with NVIDIA time slicing enabled: 2 NVIDIA GPUs (L40S, A10G)
    - >12TB ingested data with High Availability:
      - 2 GPU worker nodes
      - 1 non-GPU worker node
      - Production with Multi-Instance GPU (MIG) enabled: 4 MIG-capable NVIDIA GPUs (A100 80GB, H100, RTX PRO 6000)
For more information about MIG, see Configuring a Multi-Instance GPU (MIG) with CAS.
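For the non-production time-slicing option, GPU sharing is typically enabled through a device-plugin ConfigMap that the NVIDIA GPU operator consumes. The following is a minimal sketch only: the ConfigMap name, the nvidia-gpu-operator namespace, and the replica count are assumptions, so confirm the exact schema and values against the NVIDIA GPU operator documentation and your IBM contact before use:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config       # hypothetical name
  namespace: nvidia-gpu-operator  # assumed operator namespace
data:
  any: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
        - name: nvidia.com/gpu
          replicas: 2             # assumed; 2 GPUs are listed for this configuration
```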
NVIDIA requirements
- The NVIDIA GPU operator, multi-modal RAG blueprint, and NVIDIA NIM operator are pulled from the external NVIDIA NGC registry. As part of this process, you must provide NVIDIA NGC license keys to pull the NVIDIA components.
  NVIDIA NIM is a set of optimized cloud-native microservices designed to simplify the deployment of generative AI models anywhere, across cloud, data center, and GPU-accelerated workstations. For more information about the installation procedure, see NVIDIA documentation.
- Deploy the NVIDIA GPU operator 25.3.1.
- Optional: NVIDIA text re-ranking:
  The NVIDIA text re-ranker is an optional component that requires an additional supported GPU. This component takes the initial set of search results and uses additional scoring from NVIDIA to return the top-ranked results. For more information about the support matrix for the NeMo Retriever Text Reranking NIM, see NVIDIA documentation.
- Contact IBM for the following details:
  - Optimization and performance of L40S or H100 GPU utilization by the nv-ingest multi-modal PDF ingest blueprint.
  - Configuration of nv-ingest NIM components with Multi-Instance GPU (MIG) for optimal use of supported GPUs, or with time slicing. For more information, see Configuring a Multi-Instance GPU (MIG) with CAS.
  - Additional tuning parameters required to optimize pipeline performance.
- Compatibility matrix for NVIDIA's Multimodal NV-Ingest Blueprint with CAS:

  | Content-Aware Storage (version) | NVIDIA Multimodal NV-Ingest Blueprint (nv-ingest) |
  |---|---|
  | v1.0.2 | v25.3, v25.4 |
  | v1.0.4 | v25.6.2 |
- Configuring nv-ingest v25.6.2 with CAS v1.0.4:
  To ensure compatibility and stability, you must update your cluster and deployment settings when you use nv-ingest v25.6.2 with CAS v1.0.4.
  1. To retrieve the current configuration, run the following command:
     oc get KubeletConfig
  2. To increase the pod PID limit, modify the kubeletConfig by adding the following podPidsLimit to the spec section:
     spec:
       kubeletConfig:
         podPidsLimit: 12228
     Note: After you update the configuration, all worker nodes in the cluster are restarted, which can cause a temporary disruption.
     For more information about changing the podPidsLimit when you use the ROSA environment, see the Red Hat documentation.
  3. To edit the deployment, run the following command:
     oc edit deployment -n nv-ingest nv-ingest
  4. Add the INGEST_DISABLE_DYNAMIC_SCALING environment variable to the list of environment variables and set its value to "true":
     spec:
       containers:
       - env:
         - name: INGEST_DISABLE_DYNAMIC_SCALING
           value: "true"
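The podPidsLimit change can also be expressed as a complete KubeletConfig resource. The following is a sketch only: the resource name is hypothetical, and it assumes your worker MachineConfigPool carries the standard pools.operator.machineconfiguration.openshift.io/worker label (verify this on your cluster before applying):

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-pod-pids-limit        # hypothetical name
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ""  # assumed pool label
  kubeletConfig:
    podPidsLimit: 12228           # value from this document
```

Applying a KubeletConfig like this rolls the change out through the Machine Config Operator, which is what triggers the worker-node restarts noted above.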
IBM Storage Scale requirements
- Active File Management (AFM) must be configured on the IBM Storage Scale remote file system. For more information, see Configuring Active File Management.
- The IBM Storage Scale cluster must be at version 5.2.3.1 or later.
Configure Scale remote file systems
- When you set up the Global Data Platform user to create the connection to the remote file system, the user must be part of either the CsiAdmin or ContainerOperator group. However, these groups do not have enough privileges to allow watcher creation. As a prerequisite to enable successful watcher creation, add these users to the StorageAdmin group. Create a user with all the needed privileges as a prerequisite. For more information about this configuration, see Configuring Scale to enable watch creation.
- After the remote file system installation is complete, mark your storage class as the default. Edit the annotation on the storage class. For example, for the ibm-spectrum-fusion storage class, set the annotation key storageclass.kubernetes.io/is-default-class to the value true.
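As a sketch, the resulting annotation on the storage class looks like the following. The storage class name ibm-spectrum-fusion is the example from this document; the annotation key is the standard Kubernetes default-class annotation:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ibm-spectrum-fusion       # example storage class from this document
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
```

Instead of editing the object, you can set the annotation in one step with oc patch, for example: oc patch storageclass ibm-spectrum-fusion -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'.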