Enabling GPU metrics computation
You can generate evaluation results faster by running the data safety, answer quality, and retrieval quality metrics computations on GPUs.
Procedure
Complete the following steps to enable metrics computation to run on GPUs:
- Log in to the cluster.
  oc login -u <user> -p <password>
- Go to the project that contains the watsonxaiifm and openscale custom resources.
  oc project <project_name>
- Patch the watsonxaiifm-cr custom resource to run the detectors on GPUs.
  oc patch watsonxaiifm watsonxaiifm-cr -p '{"spec":{"fms_detector_hap_replicas":1,"fms_detector_hap_resources":{"limits":{"cpu":"1","memory":"4Gi","nvidia.com/gpu":"1"},"requests":{"cpu":"100m","memory":"2Gi","nvidia.com/gpu":"1"}},"fms_detector_pii_replicas":1,"fms_detector_pii_resources":{"limits":{"cpu":"1","memory":"4Gi","nvidia.com/gpu":"1"},"requests":{"cpu":"100m","memory":"2Gi","nvidia.com/gpu":"1"}},"answer_relevance_replicas":1,"answer_relevance_resources":{"limits":{"cpu":"1","memory":"4Gi","nvidia.com/gpu":"1"},"requests":{"cpu":"100m","memory":"2Gi","nvidia.com/gpu":"1"}},"context_relevance_replicas":1,"context_relevance_resources":{"limits":{"cpu":"1","memory":"4Gi","nvidia.com/gpu":"1"},"requests":{"cpu":"100m","memory":"2Gi","nvidia.com/gpu":"1"}},"faithfulness_replicas":1,"faithfulness_resources":{"limits":{"cpu":"1","memory":"4Gi","nvidia.com/gpu":"1"},"requests":{"cpu":"100m","memory":"2Gi","nvidia.com/gpu":"1"}}}}' --type=merge
  If you want to perform node pinning to run the detectors on GPU nodes, run the following command. Replace <worker_hostname> with the hostname of the GPU node.
  oc patch watsonxaiifm watsonxaiifm-cr -p '{"spec":{"detector_overrides":{"hap":{"nodeSelector":{"kubernetes.io/hostname":"<worker_hostname>"}},"pii":{"nodeSelector":{"kubernetes.io/hostname":"<worker_hostname>"}},"answer_relevance":{"nodeSelector":{"kubernetes.io/hostname":"<worker_hostname>"}},"context_relevance":{"nodeSelector":{"kubernetes.io/hostname":"<worker_hostname>"}},"faithfulness":{"nodeSelector":{"kubernetes.io/hostname":"<worker_hostname>"}}}}}' --type=merge
- Verify that the watsonxaiifm-cr patch completes successfully.
  oc get watsonxaiifm watsonxaiifm-cr
- Patch the openscale custom resource to enable the flag to leverage the enabled models.
  oc patch woservice aiopenscale -p '{"spec": {"mcm": {"use_gr_api":"True", "enable_metric_parallellism":"True"} }}' --type=merge
- Verify that the openscale patch completes successfully.
  oc get woservice aiopenscale