Installing IBM Model Gateway
If you intend to use large language models (LLMs) with your IBM® Business Automation Workflow deployment, then it is recommended that you install and use IBM Model Gateway. The Model Gateway provides a unified interface for managing and routing requests to LLMs, with built-in OpenTelemetry configuration. OpenTelemetry is an open standard for collecting telemetry data such as traces, metrics, and logs.
Before you begin
The following prerequisites are needed to install the Model Gateway.
- CLI tools (kubectl, helm, oc).
- Kubernetes cluster connectivity.
- Storage class availability.
- A Business Automation Workflow on containers deployment is already installed.
- An external PostgreSQL server.
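The CLI tools prerequisite can be checked before you start. The following is a minimal sketch, not part of the product: the check_cli_tools function name is hypothetical, and it only confirms that each tool is on the PATH, not that the tool can reach the cluster.

```shell
#!/usr/bin/env bash
# Report which of the given command-line tools are missing from PATH.
# Prints one line per missing tool and returns non-zero if any is missing.
check_cli_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example: check_cli_tools kubectl helm oc
```

The function takes the tool names as arguments, so the same check can be reused for any other tools that your environment needs.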
Before you configure an AI Provider with the Model Gateway, you must create Kubernetes secrets to store its credentials.
The following example shows the required provider secret structure:
apiVersion: v1
kind: Secret
metadata:
  name: model-gateway-provider-secret
type: Opaque
data:
  openai-key: <base64-encoded-json>
  watsonx-key: <base64-encoded-json>
Each AI Provider key must contain the provider-specific credentials. The following example shows the Base64-encoded JSON for the openai-key parameter:
openai-key: eyJhcGlLZXkiOiAic2stKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqWlhZIiwidXJsIjogImh0dHBzOi8vYXBpLm9wZW5haS5jb20vdjEifQ==
The decoded JSON contains the apiKey and url parameters with the specific values:
{
"apiKey": "sk-*********************************ZXY",
"url": "https://api.openai.com/v1"
}
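To produce a value for a provider key, the credentials JSON must be Base64 encoded on a single line. The following sketch shows one way to encode and verify such a value. The function names and the sk-placeholder API key are hypothetical and are used for illustration only.

```shell
#!/usr/bin/env bash
# Encode a provider credentials JSON document as single-line Base64.
# tr strips the line wrapping that some base64 implementations add.
encode_provider_key() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Decode a Base64 value back to JSON to confirm its content.
decode_provider_key() {
  printf '%s' "$1" | base64 --decode
}

# Placeholder credentials (assumption); replace with your provider values.
provider_json='{"apiKey": "sk-placeholder","url": "https://api.openai.com/v1"}'
encoded=$(encode_provider_key "$provider_json")
echo "openai-key: $encoded"
decode_provider_key "$encoded"
```

If you create the secret with kubectl create secret generic and the --from-literal option instead of a YAML file, kubectl performs the Base64 encoding itself, so pass the raw JSON in that case.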
About this task
The baw-model-gateway-deployment.sh script automates the deployment of the Model Gateway operator for Business Automation Workflow environments. The script handles the complete deployment lifecycle, including prerequisites verification, operator installation, and instance configuration.
The Zen service from Cloud Pak foundational services is installed when you install your Business Automation Workflow deployment, and provides the authentication and UI integration for the Model Gateway.
The Model Gateway can be configured with the following AI Providers:
- OpenAI: GPT models
- AWS Bedrock: Multiple model providers
- Azure OpenAI: Azure-hosted OpenAI models
- IBM watsonx.ai: IBM foundation models
- Google Vertex AI: Google AI models
The baw-model-gateway-deployment.sh script can be run interactively or with the required parameters in the command:
- Interactive installation (recommended)
  ./baw-model-gateway-deployment.sh -n cp4ba-instance [-o cp4ba-operators]
  Where:
  - cp4ba-instance is the namespace of your Business Automation Workflow deployment.
  - cp4ba-operators is the namespace of your Business Automation Workflow operators.
  The script requires an external PostgreSQL database for standard deployments. The script can generate the PostgreSQL secret from a property file if the Kubernetes secret does not exist.
- Command parameters
  Table 1. Listing of the script options to include in the command
  | Parameter | Help message | Default value |
  |---|---|---|
  | -h, --help | Display help message | - |
  | -o, --operator-namespace | Operator namespace | cp4ba-operators |
  | -n, --instance-namespace | Instance namespace | cp4ba-instance |
  | -s, --scale-config | Scale: small, medium, large, or small_mincpureq | small |
  | -b, --block-storage-class | Block storage class for Redis | Required |
  | -f, --file-storage-class | File storage class | Required |
  | -v, --storage-vendor | Storage vendor: ocs or portworx | Optional |
  | -p, --pull-secret | Images pull secret name | ibm-entitlement-key |
  | -r, --registry | Image registry | cp.icr.io |
  | -l, --license | License type: Enterprise or Standard | Enterprise |
  | --accept-license | Accept license agreement | Required |
  | --lite-install | Use SQLite (dev/test only) | false |
  | --redis-channel | Redis operator channel. Note: For small-scale deployments, Redis is disabled. | v1.4 |
  | --verify-only | Verify prerequisites only | - |
  | --use-custom-tls | Enable custom TLS flow | - |
  | --configure-providers | Configure AI providers | - |
  | --uninstall | Uninstall operator and instance | - |
  If custom TLS is enabled, the script supports two models:
  - Use an existing secret in the namespace that contains tls.crt and tls.key files.
  - Generate a server certificate from the CP4BA root CA secret. The generated certificate uses the following default values:
    - CA secret: icp4a-root-ca
    - Generated TLS secret: model-gateway-custom-tls
    - Default server CN: model-gateway-service.<instance-namespace>.svc
  The Helm deployment includes the generated or validated TLS secret.
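For a non-interactive run, the options from Table 1 can be combined in one command. The following sketch assembles and prints such a command so that you can review it before you run it. The build_install_command function and the storage class names are hypothetical, and the sketch assumes that --accept-license is a flag that takes no value, which the table does not state.

```shell
#!/usr/bin/env bash
# Assemble a non-interactive invocation from the options in Table 1.
# Assumption: --accept-license is a flag that takes no value.
build_install_command() {
  local instance_ns=$1
  local operator_ns=$2
  local block_sc=$3
  local file_sc=$4
  printf '%s' "./baw-model-gateway-deployment.sh"
  printf ' %s' \
    -n "$instance_ns" \
    -o "$operator_ns" \
    -s small \
    -b "$block_sc" \
    -f "$file_sc" \
    -l Enterprise \
    --accept-license
  printf '\n'
}

# Print the command for review; the storage class names are placeholders.
build_install_command cp4ba-instance cp4ba-operators example-block-sc example-file-sc
```

Run the script with the --help parameter to confirm the exact syntax of each option in your version before you use the printed command.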
Procedure
Results
Use the provided Grafana dashboards to visualize the Prometheus metrics and OpenTelemetry traces.
The Model Gateway collects metrics to monitor its performance and health and exports them to Prometheus.
- Request count: Number of requests by provider, model, and tenant.
- Latency: Request processing time.
- Error rate: Percentage of failed requests by provider and tenant.
- Token usage: Number of tokens that are used by tenant and model.
- Cache hit and miss ratio: Percentage of cache hits and misses.
Tracing provides the following information.
- End-to-end request traces: Trace requests from client to provider and back.
- Span attributes: Provider, model, tenant, and request ID.
- Errors: Capture error details in spans.
Logging provides the following information.
- Log levels: debug, info, warn, and error.
- Context enrichment: Request ID, tenant ID, and user ID.
- Sensitive data redaction: API keys, credentials, and PII.
- JSON format: Machine-readable logs.
The Model Gateway provides several dashboards for monitoring.
- Provider health: Status of each provider.
- Tenant usage: Metrics by tenant.
- System resources: CPU, memory, and network usage.
- Error rates: Errors by provider, tenant, and endpoint.
What to do next
You can configure the Model Gateway programmatically by using the watsonx.ai REST API. For more information, see Setting up the model gateway with code. To run inference on foundation models, you can use the watsonx.ai REST API or the OpenAI Python SDK. For more information, see Inferencing models through the model gateway.
You can then enable generative AI in your Business Automation Workflow deployments by using the Model Gateway to access multiple LLM providers. For more information, see Enabling generative AI through the model gateway.
If the Model Gateway is not working as expected, then use the following troubleshooting steps to resolve any issues:
- Run the describe command on the Model Gateway.
  oc describe modelgateway modelgateway-cr -n cp4ba-instance
- Check all the events in the namespace.
  oc get events -n cp4ba-instance --sort-by='.lastTimestamp'
- Check the status of Redis.
  oc get rediscp -n cp4ba-instance -o yaml
- Check the PVCs.
  oc get pvc -n cp4ba-instance
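The troubleshooting commands can be run together and their output saved for later review or for a support case. The following is a minimal sketch under stated assumptions: the collect_model_gateway_diagnostics function and the output file names are hypothetical, and the CR name modelgateway-cr is taken from the describe command in the troubleshooting steps, so adjust it if your CR is named differently.

```shell
#!/usr/bin/env bash
# Run the troubleshooting commands for one namespace and save each
# output to a file in the given directory (created if it is missing).
# Assumption: the CR is named modelgateway-cr, as in the describe step.
collect_model_gateway_diagnostics() {
  local ns=$1
  local out_dir=$2
  mkdir -p "$out_dir"
  oc describe modelgateway modelgateway-cr -n "$ns" > "$out_dir/describe.txt" 2>&1
  oc get events -n "$ns" --sort-by='.lastTimestamp' > "$out_dir/events.txt" 2>&1
  oc get rediscp -n "$ns" -o yaml > "$out_dir/rediscp.yaml" 2>&1
  oc get pvc -n "$ns" > "$out_dir/pvc.txt" 2>&1
}

# Example: collect_model_gateway_diagnostics cp4ba-instance ./mg-diagnostics
```

Error output is written to the same files, so a failing command, for example when Redis is disabled in a small-scale deployment, is recorded instead of stopping the collection.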
- Add AI Providers
  To configure AI Providers with your deployment, run the script with the --configure-providers parameter. The --configure-providers parameter enables:
  - Interactive provider configuration with guided prompts.
  - Configuration merges that preserve existing settings.
  - Automatic CR patching with validation.
  - Pod restart after successful reconciliation.
  To add providers to an existing deployment, run the following command.
  ./baw-model-gateway-deployment.sh \
    --configure-providers --instance-namespace cp4ba-instance
  After the CR is patched, the script waits for the status to be complete and provides confirmation.
  status.modelgatewayStatus == "Completed"
  status.progress == "100%"
  If the status does not reach 100% within 5 minutes, the script skips the pod restart. You can manually restart the pods when the reconciliation completes by running the following command.
  oc rollout restart deployment/model-gateway -n <ns>
- Uninstall the Model Gateway
  To uninstall the Model Gateway, run the script with the --uninstall parameter.
  ./baw-model-gateway-deployment.sh \
    --uninstall \
    -o cp4ba-operators \
    -n cp4ba-instance
  The uninstall process completes the following actions:
  - Deletes the Model Gateway CR.
  - Waits for resources to be cleaned up.
  - Deletes the Model Gateway secrets.
    [INFO] Deleting Model Gateway secrets...
    [✔] Deleted secret: model-gateway-postgres-external-secret
    [✔] Deleted secret: model-gateway-provider-secret
  - Uninstalls the operator Helm release.
  - Uninstalls the CRD.
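If the script skips the pod restart after you add AI Providers, you can wait for the reconciliation yourself before you restart the pods. The following sketch polls the two status fields that the script checks. The wait_for_model_gateway function is hypothetical, and the sketch assumes that the CR is named modelgateway-cr, as in the troubleshooting describe command.

```shell
#!/usr/bin/env bash
# Poll the Model Gateway CR until it reports Completed and 100%, or
# until the timeout expires. Returns 0 on success and 1 on timeout.
# Assumption: the CR is named modelgateway-cr.
wait_for_model_gateway() {
  local ns=$1
  local timeout=${2:-300}   # seconds; 300 matches the 5-minute limit
  local interval=${3:-10}   # seconds between polls
  local elapsed=0
  local state progress
  while [ "$elapsed" -lt "$timeout" ]; do
    state=$(oc get modelgateway modelgateway-cr -n "$ns" \
      -o jsonpath='{.status.modelgatewayStatus}' 2>/dev/null)
    progress=$(oc get modelgateway modelgateway-cr -n "$ns" \
      -o jsonpath='{.status.progress}' 2>/dev/null)
    if [ "$state" = "Completed" ] && [ "$progress" = "100%" ]; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}

# Example: restart the pods only after reconciliation completes.
# wait_for_model_gateway cp4ba-instance && \
#   oc rollout restart deployment/model-gateway -n cp4ba-instance
```

The timeout and polling interval are optional arguments, so you can wait longer than 5 minutes for a slow reconciliation.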