Planning and prerequisites

Go through the CAS installation prerequisites before you install the service.

General requirements

Ensure that you meet the following prerequisites for the CAS solution:
  • Ensure that you have IBM Fusion version 2.10.0.
    Note: The CAS service can be installed only if IBM Fusion is installed in the default ibm-spectrum-fusion-ns namespace. If IBM Fusion is installed in a custom namespace, the CAS installation cannot be completed.
  • Resource requirements
  • NVIDIA requirements. See NVIDIA requirements.
  • IBM Storage Scale requirements. See IBM Storage Scale requirements.
  • Configure Scale remote filesystems. See Configure Scale remote file systems.
  • Ensure that you have IBM Storage Scale version 5.2.3.1 or later.
  • After at least one IBM Storage Scale remote file system is configured, in the OpenShift® console, go to Storage > StorageClasses and verify that the storage class that was created from the IBM Storage Scale remote file system is available for CAS. For example, if the storage class is named ibm-spectrum-fusion, verify that it is listed.
  • Configure NIMs. See Configuring NVIDIA Inference Microservices (NIMs).
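The storage class check described above can also be done from the command line. A minimal sketch; the storage class name ibm-spectrum-fusion is an example from this document, so substitute your own:

```shell
# List all storage classes in the cluster and confirm that the one
# created from the IBM Storage Scale remote file system is present.
oc get storageclass

# Inspect the storage class details (example name).
oc get storageclass ibm-spectrum-fusion -o yaml
```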

Configuring flags in CasInstall CR

To use the NVIDIA re-ranker service, you can add the NVMM_NEMO_RANKER and NVMM_NEMO_RANKER_SERVICE flags to the CasInstall CR.

For more information, see Configuring NVIDIA Inference Microservices (NIMs).
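As a sketch, the two flags might be set as environment-style entries in the CasInstall CR spec. The field path shown here (spec.env) is an assumption, not the documented CasInstall schema; treat Configuring NVIDIA Inference Microservices (NIMs) as the authoritative reference for the exact placement:

```yaml
# Sketch only: the spec.env placement is an assumption.
spec:
  env:
    - name: NVMM_NEMO_RANKER
      value: "true"
    - name: NVMM_NEMO_RANKER_SERVICE
      value: "true"
```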

Allowlist for proxy

Add the following URLs to the proxy allowlist for the installation:
  • apigee.googleapis.com: Through the Apigee hybrid runtime, provides information about proxies, shared flows, and other key components, as well as configuration and system health.
  • apigeeconnect.googleapis.com: Needed for apigee-mart-server and apigee-connect communication when VPC-SC is enabled.
  • binaryauthorization.googleapis.com: Optional, only for Anthos if the binary authorization is enabled.
  • gcr.io: Google Container Registry where the container images are hosted.
  • raw.githubusercontent.com: To install the operator manifest.

Resource requirements

Work with your IBM representative to ensure that you have all the hardware components that are necessary for the CAS solution. Allocate three OpenShift compute nodes.

  • The following resources are required for CAS to run on the three compute nodes:
    1. 3 x NVIDIA L40S or H100 GPUs
    2. An additional L40S or H100 GPU for the optional re-ranker service
    3. 166 vCPU (83 physical cores)
    4. 1024 GB of memory
    5. 1 TB of IBM Storage Scale storage
    6. The most common IBM Storage Scale ESS configuration is 48 NVMe drives with a capacity of 30 TB in the ESS 6000 with 4 CX-7 network adapters (1.5 TB memory)
  • CAS service system requirements for the Content-Aware Storage (CAS) component:
    vCPUs:
    • Starter configuration (<12 TB ingested data): 160 vCPU (SMT=2)
    • >12 TB ingested data with high availability: 320 vCPU
    Memory:
    • Starter configuration: 768 GiB
    • >12 TB ingested data with high availability: 2560 GiB
    Storage:
    • The minimum required file system size is 200 GB. For more information about IBM Spectrum Scale Container Native Storage Access, see Hardware requirements for IBM Storage Scale Container Native Storage Access.
    GPU:
    • Starter configuration (<12 TB ingested data):
      • 1 GPU worker node
      • 2 non-GPU worker nodes
      • Production with Multi-Instance GPU (MIG) enabled: 2 MIG-capable NVIDIA GPUs (A100 80GB, H100, RTX PRO 6000)
      • Production without Multi-Instance GPU (MIG) enabled: 6 NVIDIA GPUs (L40S, A100 40GB, A100 80GB, H100, H200, RTX PRO 6000)
      • Non-production with NVIDIA time slicing enabled: 2 NVIDIA GPUs (L40S, A10G)
    • >12 TB ingested data with high availability:
      • 2 GPU worker nodes
      • 1 non-GPU worker node
      • Production with Multi-Instance GPU (MIG) enabled: 4 MIG-capable NVIDIA GPUs (A100 80GB, H100, RTX PRO 6000)

    For more information about MIG, see Configuring a Multi-Instance GPU (MIG) with CAS.

NVIDIA requirements

Note: NVIDIA NIM 25.3 supports OpenShift Container Platform 4.15.x, 4.16.x, and 4.17.x.
  • The NVIDIA GPU operator, multi-modal RAG blueprint, and NVIDIA NIM operator are pulled from the external NVIDIA NGC registry. As part of this process, you must provide NVIDIA NGC license keys to pull the NVIDIA components.

    NVIDIA NIM is a set of optimized cloud-native microservices designed to simplify the deployment of generative AI models anywhere, across cloud, data center, and GPU-accelerated workstations. For more information about the procedure to install, see NVIDIA documentation.

  • Deploy NVIDIA GPU operator 25.3.1.
  • Optional: NVIDIA text re-ranking:

    The NVIDIA text re-ranker is an optional component that requires an additional supported GPU. This component takes the initial set of search results and uses additional scoring from NVIDIA to return the top-ranked results. For more information about the support matrix for the NeMo Retriever Text Reranking NIM, see the NVIDIA documentation.

  • Contact IBM for the following details:
    • To optimize performance and L40S or H100 GPU utilization for the nv-ingest multi-modal PDF ingest blueprint.
    • To configure nv-ingest NIM components with Multi-Instance GPU (MIG) for optimal use of supported GPUs or with time slicing. For more information, see Configuring a Multi-Instance GPU (MIG) with CAS.
    • Additional tuning parameters required to optimize pipeline performance.
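For the non-production time-slicing option, the NVIDIA GPU operator supports a device-plugin sharing configuration delivered as a ConfigMap. A minimal sketch based on the NVIDIA GPU operator's documented time-slicing format; the ConfigMap name, namespace, and replica count are examples, and the operator's ClusterPolicy must be updated to reference this ConfigMap:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: time-slicing-config        # example name
  namespace: nvidia-gpu-operator   # example namespace
data:
  any: |-
    version: v1
    sharing:
      timeSlicing:
        resources:
          - name: nvidia.com/gpu
            replicas: 2            # expose each physical GPU as 2 schedulable GPUs
```

After the ConfigMap is created, reference it from the ClusterPolicy (devicePlugin.config) so that the device plugin advertises the sliced GPU resources.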
Compatibility matrix for NVIDIA's Multimodal NV-Ingest Blueprint with CAS:
  • Content-Aware Storage v1.0.2: nv-ingest v25.3, v25.4
  • Content-Aware Storage v1.0.4: nv-ingest v25.6.2
Configuring nv-ingest v25.6.2 with CAS v1.0.4
To ensure compatibility and stability, update your cluster and deployment settings when you use nv-ingest v25.6.2 with CAS v1.0.4.
  1. To retrieve the current configuration, run the following command:
    oc get KubeletConfig
  2. To increase the pod PID limit, modify the kubeletConfig by adding the following podPidsLimit to the spec section:
    spec:
      kubeletConfig:
        podPidsLimit: 12228
    Note: After you update the configuration, all worker nodes in the cluster are restarted, which can cause a temporary disruption.

    For more information on changing the podPidsLimit when using the ROSA environment, see the Red Hat documentation.
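    The podPidsLimit change in step 2 can also be expressed as a complete KubeletConfig CR that targets the worker machine config pool; a sketch following the standard OpenShift KubeletConfig format (the CR name is an example, and applying it restarts the worker nodes as noted above):

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-pod-pids-limit         # example name
spec:
  machineConfigPoolSelector:
    matchLabels:
      # Standard label on the worker machine config pool.
      pools.operator.machineconfiguration.openshift.io/worker: ""
  kubeletConfig:
    podPidsLimit: 12228
```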

  3. To edit the deployment, run the following command:
    oc edit deployment -n nv-ingest nv-ingest
  4. Add the INGEST_DISABLE_DYNAMIC_SCALING environment variable to the list of environment variables and set its value to "true".
    spec:
      containers:
      - env:
        - name: INGEST_DISABLE_DYNAMIC_SCALING
          value: "true"
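Steps 3 and 4 can alternatively be done in a single command with oc set env, which patches the deployment and triggers a new rollout:

```shell
# Set the variable on the containers in the nv-ingest deployment.
oc set env deployment/nv-ingest -n nv-ingest INGEST_DISABLE_DYNAMIC_SCALING=true
```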

IBM Storage Scale requirements

If your data source is on S3:
  • Active File Management (AFM) must be configured on the IBM Storage Scale remote file system. For more information, see Configuring Active File Management.
  • The IBM Storage Scale cluster must be at version 5.2.3.1 or later.

Configure Scale remote file systems

To configure remote file systems, see Connecting to remote IBM Storage Scale file systems. For software requirement levels, see Software requirement levels.
Important:
  • When you set up the Global Data Platform user to create the connection to the remote file system, the user must be part of either the CsiAdmin or ContainerOperator group.

    However, these groups do not have sufficient privileges to allow watcher creation. As a prerequisite for successful watcher creation, also add the user to the StorageAdmin group so that the user has all the required privileges. For more information about this configuration, see Configuring Scale to enable watch creation.

  • After the remote file system installation is complete, mark your storage class (for example, ibm-spectrum-fusion) as the default. Edit the annotations of the storage class and add the following key-value pair:
    storageclass.kubernetes.io/is-default-class: "true"
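The default-class annotation can also be set from the command line with oc patch, following the standard Kubernetes pattern for changing the default storage class; the storage class name ibm-spectrum-fusion is the example used in this document:

```shell
# Mark the storage class as the cluster default (example name).
oc patch storageclass ibm-spectrum-fusion -p \
  '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'
```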