Planning and prerequisites
Go through the Content-Aware Storage (CAS) installation prerequisites before you install the service.
General requirements
- Ensure that you have IBM Fusion version 2.13.0.
  Important:
  - CAS service installation can be performed only if IBM Fusion is installed by using the default ibm-spectrum-fusion-ns namespace. If the IBM Fusion installation is performed by using a custom namespace, the CAS installation cannot be completed.
  - IBM Fusion Access for Storage Area Network (SAN) cannot be deployed together with the Global Data Platform or CAS services. If IBM Fusion Access for SAN is planned or installed, you cannot deploy the Global Data Platform or CAS services. Review your architectural requirements and deployment goals before installing these services.
- Resource requirements
- * IBM Storage Scale requirements. See IBM Storage Scale requirements.
- * Configure Scale remote filesystems. See Configure Scale remote file systems.
- * Ensure that you have the latest IBM Storage Scale 5.2.3.x version.
- * After at least one IBM Storage Scale remote file system is configured, in the OpenShift® console, verify whether the storage class created from the IBM Storage Scale remote file system is available for CAS. For example, if you enter ibm-spectrum-fusion, verify whether it is available.
- Configure NIMs. See Configuring NVIDIA Inference Microservices (NIMs).
* Skip these requirements if you are installing CAS without a remotely mounted IBM Storage Scale filesystem.
Configuring flags in CAS ConfigMap
To use the NVIDIA re-ranker service,
you can add the NVMM_NEMO_RANKER and NVMM_NEMO_RANKER_SERVICE
flags to the cas-config ConfigMap.
For more information, see Configuring NVIDIA Inference Microservices (NIMs).
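As a rough illustration of adding the two flags, the following sketch builds a JSON merge patch for the ConfigMap data section. The flag values, the helper name cas_ranker_patch, and the CAS_NAMESPACE variable are assumptions made for illustration only; take the actual values from Configuring NVIDIA Inference Microservices (NIMs).

```shell
# Build a JSON merge patch that sets the two re-ranker flags in a ConfigMap's
# data section. ConfigMap data values are strings, so both values are quoted.
# This sketch does not escape quotes or backslashes: pass plain values only.
cas_ranker_patch() {
  ranker_value=$1          # value for NVMM_NEMO_RANKER (assumed, not from this page)
  ranker_service_value=$2  # value for NVMM_NEMO_RANKER_SERVICE (assumed)
  printf '{"data":{"NVMM_NEMO_RANKER":"%s","NVMM_NEMO_RANKER_SERVICE":"%s"}}\n' \
    "$ranker_value" "$ranker_service_value"
}

# Hypothetical usage (not executed here). CAS_NAMESPACE and both placeholder
# values must be replaced with the ones from your environment:
#   oc patch configmap cas-config -n "$CAS_NAMESPACE" --type merge \
#     -p "$(cas_ranker_patch "<ranker-value>" "<ranker-service-value>")"
```

A merge patch only adds or overwrites the listed keys and leaves the other entries in the ConfigMap untouched.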
Allowlist for proxy
- apigee.googleapis.com: Using Apigee Hybrid runtime, you can learn about proxies, shared flows, and other key components. It also provides information about configuration and system health.
- apigeeconnect.googleapis.com: Needed for apigee-mart-server and apigee-connect communication when VPC-SC is enabled.
- binaryauthorization.googleapis.com: Optional, only for Anthos if the binary authorization is enabled.
- gcr.io: Google Container Registry where the container images are hosted.
- raw.githubusercontent.com: To install the operator manifest.
Resource requirements
Work with your IBM representative to ensure that you have all the hardware components that are necessary for the CAS solution. Allocate three OpenShift compute nodes.
- The following resources are required for CAS to run on the three compute nodes:
  - 3 x NVIDIA L40S or H100 GPUs
    - An additional L40S or H100 GPU is needed for the optional re-ranker service.
  - 166 vCPU (83 physical cores)
  - 1024 GB memory
  - 1 TB IBM Storage Scale
    - The most common IBM Storage Scale ESS configuration is 48 NVMe drives with a capacity of 30 TB in the ESS 6000 with 4 CX-7 network adapters (1.5 TB memory).
- When using the Docling Multimodal Processing Engine, see System requirements.
- For CAS service system requirements, see System requirements for each of the components in addition to the resource requirements.
Note: Content-Aware Storage (CAS) depends upon IBM Fusion management software and the Global Data Platform service.
Enabling GPUs in OpenShift
- Red Hat’s Node Feature Discovery Operator - See the Red Hat documentation.
- NVIDIA’s GPU operator version - See the NVIDIA documentation.
To deploy the GPU operators in a disconnected or air-gapped environment, see the NVIDIA documentation.
NVIDIA Multimodal Document Processing Engine requirements
- The NVIDIA GPU operator, multi-modal RAG blueprint, and NVIDIA NIM operator are pulled from the external NVIDIA NGC registry. As part of this process, you must provide NVIDIA NGC license keys to pull the NVIDIA components.
  NVIDIA NIM is a set of optimized cloud-native microservices that are designed to simplify the deployment of generative AI models anywhere, across cloud, data center, and GPU-accelerated workstations. Initially, you must deploy the NVIDIA NeMo Retriever. For more information about the procedure to deploy, see NVIDIA documentation.
- Optional: NVIDIA text reranking
  The NVIDIA text reranker is an optional component that requires an additional supported GPU. This component takes the initial set of search results and uses additional scoring from NVIDIA to return the top ranked results. For more information about deploying NeMo Retriever text reranking NIM, see NVIDIA documentation.
- Contact IBM for the following details:
  - Optimizing L40S or H100 GPU utilization and performance for the NVIDIA NeMo Retriever Library multimodal PDF Ingest blueprint.
  - Configuring NeMo Retriever Library NIM components with Multi-Instance GPU (MIG) or with time slicing for optimal use of supported GPUs. For more information, see Configuring a Multi-Instance GPU (MIG) with CAS.
  - Additional tuning parameters required to optimize pipeline performance.
- Compatibility matrix for NVIDIA's NeMo Retriever Library Multimodal blueprint with CAS

  | Content-Aware Storage (version) | NVIDIA NeMo Retriever Library Multimodal blueprint |
  | --- | --- |
  | v1.0.6, v1.0.7, v1.1.0 | v25.9.0 |
  | v1.1.1, v1.1.2, v1.1.3, v1.1.4 | v26.1.2 |
  | v1.1.5 and newer | v26.3.0 |
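The compatibility matrix can be expressed as a small lookup, which may be handy in an installation script. The following is a minimal sketch; the function name blueprint_for_cas is hypothetical, and it reads "v1.1.5 and newer" as any version that compares numerically at or above 1.1.5.

```shell
# Print the blueprint version that the matrix pairs with a CAS version.
# Returns 1 for malformed input or for versions the matrix does not list.
blueprint_for_cas() {
  ver=${1#v}                         # drop the leading "v", if any
  case "$ver" in
    ''|*[!0-9.]*) return 1 ;;        # accept digits and dots only
    1.0.6|1.0.7|1.1.0) echo "v25.9.0"; return 0 ;;
    1.1.1|1.1.2|1.1.3|1.1.4) echo "v26.1.2"; return 0 ;;
  esac
  # Remaining row: "v1.1.5 and newer". Split the version into numeric fields.
  old_ifs=$IFS; IFS=.
  set -- $ver
  IFS=$old_ifs
  major=${1:-0}; minor=${2:-0}; patch=${3:-0}
  if [ "$major" -gt 1 ] ||
     { [ "$major" -eq 1 ] && [ "$minor" -gt 1 ]; } ||
     { [ "$major" -eq 1 ] && [ "$minor" -eq 1 ] && [ "$patch" -ge 5 ]; }; then
    echo "v26.3.0"
    return 0
  fi
  return 1                           # older or unlisted version
}
```

For example, `blueprint_for_cas v1.1.3` prints `v26.1.2`, while a version that is missing from the matrix, such as v1.0.5, produces no output and a non-zero exit status.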
IBM Storage Scale requirements
- Active File Management (AFM) must be configured on the IBM Storage Scale remote file system. For more information, see Configuring Active File Management.
- The IBM Storage Scale cluster must be at version 5.2.3.1 or later.
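The minimum-version requirement can be checked in a script with a version-aware comparison. The sketch below assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does; the function name version_at_least is hypothetical, and how you obtain the cluster version is outside the scope of this sketch.

```shell
# Succeed when version $1 is at least version $2.
# Relies on `sort -V`, which orders dotted version strings field by field.
version_at_least() {
  have=$1
  need=$2
  lowest=$(printf '%s\n%s\n' "$need" "$have" | sort -V | head -n 1)
  # If the required version sorts first (or both are equal), $have is new enough.
  [ "$lowest" = "$need" ]
}

# Hypothetical usage, with SCALE_VERSION set to your cluster's version:
#   version_at_least "$SCALE_VERSION" "5.2.3.1" || echo "Upgrade IBM Storage Scale first"
```

A plain string comparison would rank 5.2.3.10 below 5.2.3.2, which is why a version sort is used here.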
Configure Scale remote file systems
- When you set up the Global Data Platform user to create the connection to the remote file system, the user must be part of either the CsiAdmin or ContainerOperator group. However, these groups do not have enough privileges to allow watcher creation. As a prerequisite to enable successful watcher creation, add these users to the StorageAdmin group. Create a user with all the privileges needed as a prerequisite. For more information about this configuration, see Configuring IBM Storage Scale user to enable watch creation.
- After the remote file system installation is complete, mark your storage class, for example, ibm-spectrum-fusion, as default, by running the following command:
  oc patch storageclass ibm-spectrum-fusion -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
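After patching, you might want to confirm that the annotation took effect. The following is a minimal sketch that reads the annotation back with a JSONPath query; the function names are hypothetical, and it assumes that you are logged in to the cluster with `oc`.

```shell
# Print the is-default-class annotation of a storage class (empty if unset).
# The dots inside the annotation key are escaped for the JSONPath expression.
default_class_annotation() {
  oc get storageclass "$1" \
    -o jsonpath='{.metadata.annotations.storageclass\.kubernetes\.io/is-default-class}'
}

# Succeed only when the named storage class is marked as the default.
check_default_storage_class() {
  [ "$(default_class_annotation "$1")" = "true" ]
}

# Example call (not executed here):
#   check_default_storage_class ibm-spectrum-fusion && echo "marked as default"
```

If another storage class is also annotated as default, you might need to remove the annotation from it so that only one default remains.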
Offline installation
Content-Aware Storage supports usage of the Docling Multimodal document processing engine in an offline environment. For more information about the procedure to deploy offline, see Mirroring Content-Aware Storage (CAS) images.