Configuring NVIDIA Inference Microservices (NIMs)
After you install CAS, create the cas-config ConfigMap, which provides the nv-ingest
endpoint information.
To determine the nv-ingest endpoint information, you must know the namespace in
which nv-ingest is installed. The endpoints are constructed by using the pattern
http(s)://<service-name>.<namespace>.svc.cluster.local. Because nv-ingest does not
support https by default, the default installation namespace of nv-ingest results
in the following URLs:
- Embed service: http://nv-ingest-embedqa.nv-ingest.svc.cluster.local
- Ingest service: http://nv-ingest.nv-ingest.svc.cluster.local
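The URL pattern can be sketched in shell as follows. This is a minimal illustration, not part of the product: the variable names NV_INGEST_NAMESPACE and NV_INGEST_SCHEME are hypothetical, and the sketch assumes that the service names nv-ingest-embedqa and nv-ingest stay the same when nv-ingest is installed in a different namespace.

```shell
# Hypothetical variables; override them if nv-ingest runs in another namespace
# or has been configured for https.
NV_INGEST_NAMESPACE="${NV_INGEST_NAMESPACE:-nv-ingest}"
NV_INGEST_SCHEME="${NV_INGEST_SCHEME:-http}"

# Assumption: the service names match the default installation.
EMBED_URL="${NV_INGEST_SCHEME}://nv-ingest-embedqa.${NV_INGEST_NAMESPACE}.svc.cluster.local"
INGEST_URL="${NV_INGEST_SCHEME}://nv-ingest.${NV_INGEST_NAMESPACE}.svc.cluster.local"

# Guard: the URLs must not end with the / character.
for url in "$EMBED_URL" "$INGEST_URL"; do
  case "$url" in
    */) echo "error: URL must not end with /: $url" >&2; exit 1 ;;
  esac
done

echo "Embed service:  $EMBED_URL"
echo "Ingest service: $INGEST_URL"
```

With both variables unset, the sketch prints the two default URLs listed above.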
Note: These URLs must not end with the / character.
To complete this activity, use the oc command line tool and run the following
command:
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: cas-config
  namespace: ibm-cas
data:
  NVMM_EMBED_SERVICE: <Embed Service URL as determined above>
  NVMM_NIM_SERVICE: <Ingest Service URL as determined above>
EOF
NVIDIA re-ranker
Optionally, you can set NVMM_NEMO_RANKER in the CasInstall CR
to enable the NVIDIA re-ranker service. This service analyzes semantic relevance and reorders search
results to improve precision in enterprise search and streamline AI-driven workflows. For more
information, see the NVIDIA documentation.
- Re-ranker service: http://nemo-ranker-nvidia-nim-llama-32-nv-rerankqa-1b-v2.nv-ingest.svc.cluster.local
Note: This URL must not end with the / character.
To enable the NVIDIA re-ranker service, set the NVMM_NEMO_RANKER flag to
YES and the NVMM_NEMO_RANKER_SERVICE flag to the re-ranker service
URL in the CASInstall CR spec section:
spec:
  flags:
    - NVMM_NEMO_RANKER=YES
    - NVMM_NEMO_RANKER_SERVICE=http://nemo-ranker-nvidia-nim-llama-32-nv-rerankqa-1b-v2.nv-ingest.svc.cluster.local
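If the re-ranker runs in a namespace other than nv-ingest, the flag values might be generated with a short shell sketch like the following. The variable names are hypothetical, and the sketch assumes that the re-ranker service name stays the same as in the default URL. It only prints the spec fragment; it does not modify the CR.

```shell
# Hypothetical variable; the default matches the URL shown in this section.
RERANKER_NAMESPACE="${RERANKER_NAMESPACE:-nv-ingest}"
# Assumption: the service name is unchanged from the default installation.
RERANKER_SERVICE="nemo-ranker-nvidia-nim-llama-32-nv-rerankqa-1b-v2"

RERANKER_URL="http://${RERANKER_SERVICE}.${RERANKER_NAMESPACE}.svc.cluster.local"
# The URL must not end with the / character; drop one if it is present.
RERANKER_URL="${RERANKER_URL%/}"

# Build the spec fragment as text so that it can be reviewed before use.
FLAGS_YAML=$(cat <<EOF
spec:
  flags:
    - NVMM_NEMO_RANKER=YES
    - NVMM_NEMO_RANKER_SERVICE=${RERANKER_URL}
EOF
)
printf '%s\n' "$FLAGS_YAML"
```

Review the printed fragment before you copy it into the spec section of the CR.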