Planning and prerequisites

Review the Content-Aware Storage (CAS) installation prerequisites before you install the service.

General requirements

Ensure that you meet the following prerequisites for the CAS solution:
  • Ensure that IBM Fusion version 2.13.0 is installed.
    Important:
    • CAS service installation can be performed only if IBM Fusion is installed by using the default ibm-spectrum-fusion-ns namespace. If the IBM Fusion installation is performed by using a custom namespace, the CAS installation cannot be completed.
    • IBM Fusion Access for Storage Area Network (SAN) cannot coexist with the Global Data Platform or CAS services. If IBM Fusion Access for SAN is planned or installed, you cannot deploy the Global Data Platform or CAS services. Review your architectural requirements and deployment goals before installing these services.
  • Resource requirements
  • * IBM Storage Scale requirements. See IBM Storage Scale requirements.
  • * Configure Scale remote file systems. See Configure Scale remote file systems.
  • * Ensure that you are running the latest IBM Storage Scale 5.2.3.x release.
  • * After at least one IBM Storage Scale remote file system is configured, go to Storage > Storage Class in the OpenShift® console and verify that the storage class created from the remote file system is available for CAS. For example, if the storage class is named ibm-spectrum-fusion, verify that it is listed.
  • Configure NIMs. See Configuring NVIDIA Inference Microservices (NIMs).

* Skip these requirements if you are installing CAS without a remotely mounted IBM Storage Scale file system.
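The namespace and storage class requirements above can be spot-checked from the command line. The following is a minimal sketch, assuming the oc CLI is logged in to the cluster. The function name check_cas_prereqs is hypothetical, and the default storage class name is the example name used in this topic. A passing namespace check shows only that the default namespace exists, not that every other prerequisite is met.

```shell
# Sketch: check for the default IBM Fusion namespace and a storage class.
# Assumes `oc` is installed and logged in to the OpenShift cluster.
check_cas_prereqs() {
  local sc_name="${1:-ibm-spectrum-fusion}"  # example name from this topic
  local status=0

  # CAS requires IBM Fusion in the default ibm-spectrum-fusion-ns namespace.
  if ! oc get namespace ibm-spectrum-fusion-ns >/dev/null 2>&1; then
    echo "missing namespace: ibm-spectrum-fusion-ns"
    status=1
  fi

  # Skip this check when installing CAS without a remote file system.
  if ! oc get storageclass "$sc_name" >/dev/null 2>&1; then
    echo "missing storage class: $sc_name"
    status=1
  fi

  return "$status"
}
```

Run check_cas_prereqs with no argument to use the example storage class name, or pass your own storage class name as the first argument.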

Configuring flags in CAS ConfigMap

To use the NVIDIA re-ranker service, you can add the NVMM_NEMO_RANKER and NVMM_NEMO_RANKER_SERVICE flags to the cas-config ConfigMap.

For more information, see Configuring NVIDIA Inference Microservices (NIMs).
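To confirm that both flags are present after you edit the ConfigMap, a small filter such as the following can help. This is a sketch under assumptions: the function name missing_ranker_flags is hypothetical, and it assumes each flag appears in the ConfigMap YAML as a key at the start of a line. It does not validate the flag values, which are described in the NIM configuration topic.

```shell
# Sketch: read ConfigMap YAML on stdin and print any re-ranker flag
# that does not appear as a "KEY:" entry. Returns 1 if a flag is missing.
missing_ranker_flags() {
  local yaml flag missing=0
  yaml=$(cat)
  for flag in NVMM_NEMO_RANKER NVMM_NEMO_RANKER_SERVICE; do
    # The trailing colon keeps NVMM_NEMO_RANKER from matching
    # the longer NVMM_NEMO_RANKER_SERVICE key.
    if ! printf '%s\n' "$yaml" | grep -Eq "^[[:space:]]*${flag}:"; then
      echo "$flag"
      missing=1
    fi
  done
  return "$missing"
}
```

A possible usage is to pipe the output of oc get configmap cas-config -n <namespace> -o yaml into missing_ranker_flags, where <namespace> is a placeholder for the namespace that contains the ConfigMap.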

Allowlist for proxy

Add the following URLs to the allowlist for proxy installation:
  • apigee.googleapis.com: Used by the Apigee Hybrid runtime to learn about proxies, shared flows, and other key components, and to provide information about configuration and system health.
  • apigeeconnect.googleapis.com: Needed for apigee-mart-server and apigee-connect communication when VPC-SC is enabled.
  • binaryauthorization.googleapis.com: Optional, only for Anthos if the binary authorization is enabled.
  • gcr.io: Google Container Registry where the container images are hosted.
  • raw.githubusercontent.com: To install the operator manifest.
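One way to check the allowlist is to attempt an HTTPS connection to each host through the proxy. The following is a minimal sketch, assuming curl is available on the workstation. The function names are hypothetical, and the proxy URL you pass in is your own. A successful connection shows only that the proxy lets traffic through to the host, not that every path on it is reachable.

```shell
# Hosts from the allowlist above.
cas_allowlist_hosts() {
  printf '%s\n' \
    apigee.googleapis.com \
    apigeeconnect.googleapis.com \
    binaryauthorization.googleapis.com \
    gcr.io \
    raw.githubusercontent.com
}

# Sketch: try each host through the proxy given as $1.
# curl exits non-zero when the proxy refuses or cannot open the tunnel.
check_proxy_allowlist() {
  local proxy="$1" host failed=0
  for host in $(cas_allowlist_hosts); do
    if curl --silent --output /dev/null --max-time 15 \
         --proxy "$proxy" "https://$host/"; then
      echo "ok      $host"
    else
      echo "blocked $host"
      failed=1
    fi
  done
  return "$failed"
}
```

For example, check_proxy_allowlist http://proxy.example.com:3128 prints one line per host, where the proxy address is a placeholder.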

Resource requirements

Work with your IBM representative to ensure that you have all the hardware components that are necessary for the CAS solution. Allocate three OpenShift compute nodes.

  • The following resources are required for CAS to run on the three compute nodes:
    • 3 x NVIDIA L40S or H100 GPUs
    • An additional L40S or H100 GPU is needed for the optional re-ranker service.
    • 166 vCPU (83 physical cores)
    • 1024 GB memory
    • 1 TB IBM Storage Scale
    • The most common IBM Storage Scale ESS configuration is an ESS 6000 with 48 NVMe drives of 30 TB each and 4 CX-7 network adapters (1.5 TB memory).
    • When using the Docling Multimodal Processing Engine, see System requirements.
  • For CAS service system requirements, see System requirements for each of the components in addition to the resource requirements.
    Note: Content-Aware Storage (CAS) depends upon IBM Fusion management software and the Global Data Platform service.
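The GPU counts above can be compared with what the cluster currently advertises. The following is a sketch, assuming the oc CLI is logged in and that the GPUs are exposed under the nvidia.com/gpu resource name. The function names are hypothetical. The count covers all nodes, not only the three CAS compute nodes, and clusters that expose MIG slices under other resource names are not counted correctly.

```shell
# Sketch: sum the allocatable nvidia.com/gpu resources across all nodes.
# Nodes without GPUs contribute an empty line, which awk counts as 0.
total_cluster_gpus() {
  oc get nodes \
    -o jsonpath='{range .items[*]}{.status.allocatable.nvidia\.com/gpu}{"\n"}{end}' |
    awk '{ sum += $1 } END { print sum + 0 }'
}

# GPUs needed per the list above: 3, plus 1 for the optional re-ranker.
gpus_required() {
  if [ "${1:-no}" = "yes" ]; then echo 4; else echo 3; fi
}
```

To compare, check whether the output of total_cluster_gpus is at least the output of gpus_required yes (with the re-ranker) or gpus_required (without it).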

Enabling GPUs in OpenShift

To enable GPUs in the OpenShift cluster, the following components must be installed to support GPU workloads:

To deploy GPU operators in a disconnected or airgapped environment, see the NVIDIA documentation.

NVIDIA Multimodal Document Processing Engine requirements

  • The NVIDIA GPU operator, multi-modal RAG blueprint, and NVIDIA NIM operator are pulled from the external NVIDIA NGC registry. As part of this process, you must provide NVIDIA NGC license keys to pull the NVIDIA components.

    NVIDIA NIM is a set of optimized cloud-native microservices that are designed to simplify the deployment of generative AI models anywhere, across cloud, data center, and GPU-accelerated workstations. Initially, you must deploy the NVIDIA NeMo Retriever. For more information about the procedure to deploy, see NVIDIA documentation.

  • Optional: NVIDIA text reranking:

    The NVIDIA text reranker is an optional component that requires an additional supported GPU. This component takes the initial set of search results and applies additional scoring from NVIDIA to return the top-ranked results. For more information about deploying the NeMo Retriever text reranking NIM, see NVIDIA documentation.

  • Contact IBM for the following details:
    • Guidance on optimizing L40S or H100 GPU utilization and performance for the NVIDIA NeMo Retriever Library multimodal PDF Ingest blueprint.
    • How to configure NeMo Retriever Library NIM components with Multi-Instance GPU (MIG) or with time slicing for optimal use of supported GPUs. For more information, see Configuring a Multi-Instance GPU (MIG) with CAS.
    • Additional tuning parameters required to optimize pipeline performance.

Compatibility matrix for NVIDIA's NeMo Retriever Library Multimodal blueprint with CAS

Content-Aware Storage (version)    NVIDIA NeMo Retriever Library Multimodal blueprint
v1.0.6, v1.0.7, v1.1.0             v25.9.0
v1.1.1, v1.1.2, v1.1.3, v1.1.4     v26.1.2
v1.1.5 and newer                   v26.3.0
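
The matrix can be expressed as a small lookup for use in installation scripts. The following is a sketch only: the function name blueprint_for_cas is hypothetical, and the "and newer" row relies on sort -V (version sort), which is assumed to be available as in GNU coreutils.

```shell
# Sketch: map a CAS version to the blueprint version in the matrix above.
# Accepts versions with or without the leading "v".
blueprint_for_cas() {
  local v="${1#v}"

  # Reject anything that is not digits and dots.
  case "$v" in
    ''|*[!0-9.]*) echo "unrecognized version: $1" >&2; return 2 ;;
  esac

  case "$v" in
    1.0.6|1.0.7|1.1.0)       echo "v25.9.0" ;;
    1.1.1|1.1.2|1.1.3|1.1.4) echo "v26.1.2" ;;
    *)
      # "v1.1.5 and newer": 1.1.5 must sort first (or be equal).
      if [ "$(printf '%s\n' "1.1.5" "$v" | sort -V | head -n 1)" = "1.1.5" ]; then
        echo "v26.3.0"
      else
        echo "no matrix entry for CAS version: $1" >&2
        return 1
      fi
      ;;
  esac
}
```

For example, blueprint_for_cas v1.1.3 prints v26.1.2, and a version older than v1.0.6 returns a non-zero status because the matrix has no entry for it.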

IBM Storage Scale requirements

If your data source is on S3:
  • Active File Management (AFM) must be configured on the IBM Storage Scale remote file system. For more information, see Configuring Active File Management.
  • The IBM Storage Scale cluster must be at version 5.2.3.1 or later.
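
The minimum version check can be scripted once you know the version of your IBM Storage Scale cluster. The following is a minimal sketch: the helper name version_at_least is hypothetical, the version string is supplied by you, and sort -V is assumed to be available.

```shell
# Sketch: succeed when version $1 is greater than or equal to version $2.
# With version sort, the smaller of the two versions is listed first.
version_at_least() {
  [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, version_at_least "5.2.3.2" "5.2.3.1" succeeds, and version_at_least "5.2.3.0" "5.2.3.1" fails.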

Configure Scale remote file systems

To configure remote file systems, see Connecting to remote IBM Storage Scale file systems. For software requirement levels, see Software requirement levels.
Important:
  • When you set up the Global Data Platform user that creates the connection to the remote file system, the user must be a member of either the CsiAdmin or the ContainerOperator group.

    However, these groups do not have sufficient privileges to create watchers. To enable watcher creation, also add the user to the StorageAdmin group so that a single user has all the required privileges. For more information about this configuration, see Configuring IBM Storage Scale user to enable watch creation.

  • After the remote file system installation is complete, mark your storage class (for example, ibm-spectrum-fusion) as the default by running the following command:
    oc patch storageclass ibm-spectrum-fusion -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
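
To confirm that the patch took effect, you can read the annotation back. The following is a sketch, assuming the oc CLI is logged in. The function name is_default_storage_class is hypothetical, and the default argument is the example storage class name from this topic.

```shell
# Sketch: succeed when the storage class carries the default-class annotation.
# Dots inside the annotation key are escaped for the jsonpath expression.
is_default_storage_class() {
  local sc="${1:-ibm-spectrum-fusion}" value
  value=$(oc get storageclass "$sc" \
    -o jsonpath='{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}') || return 2
  [ "$value" = "true" ]
}
```

For example, is_default_storage_class ibm-spectrum-fusion && echo "default storage class is set" prints the message only when the annotation value is true.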

Offline installation

Content-Aware Storage supports using the Docling Multimodal document processing engine in an offline environment. For more information about the procedure to deploy offline, see Mirroring Content-Aware Storage (CAS) images.