Configuring NVIDIA Inference Microservices (NIMs)
After you install CAS, you must create the cas-config ConfigMap, which
provides the nv-ingest endpoint information.
To determine the nv-ingest endpoint information, you must know the namespace in
which nv-ingest is installed. The endpoints are constructed by using the pattern
http(s)://<service-name>.<namespace>.svc.cluster.local. nv-ingest does not support
https by default, so the default installation namespace of nv-ingest results in
the following URLs:
- Embed service URL: http://llama-32-nv-embedqa-1b-v2.nv-ingest.svc.cluster.local
- Ingest service URL: http://nv-ingest.nv-ingest.svc.cluster.local
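The way these URLs are assembled can be sketched as a small shell helper. This is only an illustration: svc_url and the NV_INGEST_NAMESPACE variable are hypothetical names, not part of CAS or nv-ingest, and the sketch assumes the pattern is service name, then namespace, then svc.cluster.local, as the example URLs suggest.

```shell
#!/bin/sh
# Hypothetical helper: print <scheme>://<service>.<namespace>.svc.cluster.local
# The result has no trailing / character.
svc_url() {
  scheme=$1
  service=$2
  namespace=$3
  printf '%s://%s.%s.svc.cluster.local\n' "$scheme" "$service" "$namespace"
}

# Namespace in which nv-ingest is installed (assumed default shown here;
# override it if your installation uses a different namespace).
NV_INGEST_NAMESPACE=${NV_INGEST_NAMESPACE:-nv-ingest}

# http is used because nv-ingest does not support https by default.
EMBED_URL=$(svc_url http llama-32-nv-embedqa-1b-v2 "$NV_INGEST_NAMESPACE")
INGEST_URL=$(svc_url http nv-ingest "$NV_INGEST_NAMESPACE")

echo "Embed service URL:  $EMBED_URL"
echo "Ingest service URL: $INGEST_URL"
```

If nv-ingest is installed in a different namespace, setting NV_INGEST_NAMESPACE before running the sketch changes only the namespace segment of each URL.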
Note: These URLs must not end with the / character.
To complete this activity, use the oc command line tool and run the following
command:
oc apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: cas-config
  namespace: ibm-cas
data:
  NVMM_EMBED_SERVICE: <Embed Service URL as determined above>
  NVMM_NIM_SERVICE: <Ingest Service URL as determined above>
EOF

NVIDIA re-ranker
Optionally, you can set NVMM_NEMO_RANKER in the cas-config ConfigMap
to enable the NVIDIA re-ranker service.
This service analyzes semantic relevance and reorders search results to improve precision in
enterprise search and streamline AI-driven workflows. For more information, see the NVIDIA documentation.
- Re-ranker service URL:
http://llama-32-nv-rerankqa-1b-v2.nv-ingest.svc.cluster.local
Note: The service URL must not end with the / character.
To enable the NVIDIA re-ranker service, set the NVMM_NEMO_RANKER flag to YES and
the NVMM_NEMO_RANKER_SERVICE flag to the re-ranker service URL in the
cas-config ConfigMap:
oc patch configmap cas-config \
-n ibm-cas \
--type merge \
-p '{"data":{"NVMM_NEMO_RANKER": "YES", "NVMM_NEMO_RANKER_SERVICE": "<Re-ranker service URL as determined above>"}}'
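Because the service URLs must not end with the / character, it can help to check a value before or after writing it to the ConfigMap. The following is a minimal sketch, not an official procedure: check_service_url and show_cas_config_key are hypothetical helper names, and reading a key back assumes the cas-config ConfigMap in the ibm-cas namespace as used above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the URL starts with http:// or https://
# and does not end with the / character.
check_service_url() {
  case $1 in
    */)
      echo "invalid: URL ends with /: $1" >&2
      return 1
      ;;
    http://?*|https://?*)
      return 0
      ;;
    *)
      echo "invalid: URL must start with http:// or https://: $1" >&2
      return 1
      ;;
  esac
}

# Hypothetical helper: print the value of one key from the cas-config ConfigMap.
# Requires a logged-in oc session; it is defined here but not called.
show_cas_config_key() {
  oc get configmap cas-config -n ibm-cas -o "jsonpath={.data.$1}"
}

# Example usage against a live cluster:
#   check_service_url "$(show_cas_config_key NVMM_NEMO_RANKER_SERVICE)"
```

The check is purely syntactic. It does not confirm that the service exists or is reachable from the CAS pods.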