Installing IBM Model Gateway

If you intend to use large language models (LLMs) with your IBM® Business Automation Workflow deployment, then it is recommended that you install and use IBM Model Gateway. The Model Gateway provides a unified interface for managing and routing requests to LLMs, with built-in OpenTelemetry configuration. OpenTelemetry is an open standard for collecting telemetry data such as traces, metrics, and logs.

Before you begin

The following prerequisites are needed to install the Model Gateway.

CLI tools (kubectl, helm, oc).
Kubernetes cluster connectivity.
Storage class availability.
A Business Automation Workflow on containers deployment is already installed.
An external PostgreSQL server.

Before you configure an AI Provider with the Model Gateway, you must create Kubernetes secrets to store its credentials.

The following example shows the required provider secret structure:

apiVersion: v1
kind: Secret
metadata:
  name: model-gateway-provider-secret
type: Opaque
data:
  openai-key: <base64-encoded-json>
  watsonx-key: <base64-encoded-json>

Each AI Provider key must contain the provider-specific credentials as Base64-encoded JSON. The following example shows the Base64-encoded JSON for the openai-key parameter:

openai-key: eyJhcGlLZXkiOiAic2stKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqWlhZIiwidXJsIjogImh0dHBzOi8vYXBpLm9wZW5haS5jb20vdjEifQ==

The decoded JSON contains the apiKey and url parameters with the specific values.

{
  "apiKey": "sk-*********************************ZXY",
  "url": "https://api.openai.com/v1"
}
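
The encoding step can be sketched in a few lines of shell. This is a minimal illustration and not part of the product tooling: it assumes GNU coreutils `base64`, and the API key is a made-up placeholder. It builds the JSON, encodes it without line wrapping, decodes it again as a sanity check, and prints a manifest in the structure shown earlier.

```shell
#!/usr/bin/env bash
# Sketch: build the Base64 value for one provider key and print a Secret manifest.
# The API key below is a made-up placeholder, not a real credential.
api_key='sk-EXAMPLE-PLACEHOLDER'
api_url='https://api.openai.com/v1'

# Compose the provider JSON with the apiKey and url parameters.
payload=$(printf '{"apiKey": "%s", "url": "%s"}' "$api_key" "$api_url")

# Encode without line wrapping; tr strips the newlines that base64 may insert.
encoded=$(printf '%s' "$payload" | base64 | tr -d '\n')

# Decode again to confirm that the round trip returns the original JSON.
decoded=$(printf '%s' "$encoded" | base64 -d)

cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: model-gateway-provider-secret
type: Opaque
data:
  openai-key: ${encoded}
EOF
```

On a cluster, the printed manifest could be piped to `oc apply -f - -n <instance-namespace>`. The target namespace is an assumption here, based on the verification command later in this topic.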

About this task

The baw-model-gateway-deployment.sh script automates the deployment of the Model Gateway operator for Business Automation Workflow environments. The script handles the complete deployment lifecycle, including prerequisites verification, operator installation, and instance configuration.

The Zen service from Cloud Pak foundational services is installed when you install your Business Automation Workflow deployment, and provides the authentication and UI integration for the Model Gateway.

The Model Gateway supports configuration for multiple AI providers, including:

OpenAI: GPT models
AWS Bedrock: Multiple model providers
Azure OpenAI: Azure-hosted OpenAI models
IBM watsonx.ai: IBM foundation models
Google Vertex AI: Google AI models

The baw-model-gateway-deployment.sh script can be run interactively or with the required parameters in the command:

Interactive installation (recommended)

./baw-model-gateway-deployment.sh -n cp4ba-instance [-o cp4ba-operators]

Where

cp4ba-instance is the namespace of your Business Automation Workflow deployment.
cp4ba-operators is the namespace of your Business Automation Workflow operators.

The script requires an external PostgreSQL database for standard deployments. The script can generate the PostgreSQL secret from a property file if the Kubernetes secret does not exist.

Command parameters

Table 1. Listing of the script options to include in the command
Parameter	Help message	Default value
`-h, --help`	Display help message	-
`-o, --operator-namespace`	Operator namespace	cp4ba-operators
`-n, --instance-namespace`	Instance namespace	cp4ba-instance
`-s, --scale-config`	Scale: small|medium|large|small_mincpureq	small
`-b, --block-storage-class`	Block storage class for Redis	Required
`-f, --file-storage-class`	File storage class	Required
`-v, --storage-vendor`	Storage vendor: ocs|portworx	Optional
`-p, --pull-secret`	Images pull secret name	ibm-entitlement-key
`-r, --registry`	Image registry	cp.icr.io
`-l, --license`	License type: Enterprise|Standard	Enterprise
`--accept-license`	Accept license agreement	Required
`--lite-install`	Use SQLite (dev/test only)	false
`--redis-channel`	Redis operator channel. Note: For small-scale deployments, Redis is disabled.	v1.4
`--verify-only`	Verify prerequisites only	-
`--use-custom-tls`	Enable custom TLS flow	-
`--configure-providers`	Configure AI providers	-
`--uninstall`	Uninstall operator and instance	-
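
Before running the script non-interactively, it can help to check that option values match the choices listed in Table 1. The following sketch is a hypothetical pre-check, not something the deployment script provides. The function names `is_one_of` and `validate_options` are invented for illustration.

```shell
#!/usr/bin/env bash
# Sketch: check option values against the choices listed in Table 1.
# is_one_of VALUE CHOICE... returns 0 when VALUE equals one of the choices.
is_one_of() {
  local value=$1 choice
  shift
  for choice in "$@"; do
    [ "$value" = "$choice" ] && return 0
  done
  return 1
}

# validate_options SCALE LICENSE [VENDOR]; prints a message per invalid value.
validate_options() {
  local scale=$1 license=$2 vendor=${3:-} rc=0
  is_one_of "$scale" small medium large small_mincpureq \
    || { echo "invalid scale: $scale"; rc=1; }
  is_one_of "$license" Enterprise Standard \
    || { echo "invalid license: $license"; rc=1; }
  # The storage vendor is optional, so an empty value is accepted.
  if [ -n "$vendor" ] && ! is_one_of "$vendor" ocs portworx; then
    echo "invalid storage vendor: $vendor"; rc=1
  fi
  return "$rc"
}

validate_options medium Enterprise ocs && echo "options look valid"
```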

If custom TLS is enabled, the script supports two modes:

Use an existing secret in the namespace that contains tls.crt and tls.key files.
Generate a server certificate from the CP4BA root CA secret. The generated certificate uses the following default values:
- CA secret: icp4a-root-ca
- Generated TLS secret: model-gateway-custom-tls
- Default server CN: model-gateway-service.<instance-namespace>.svc

The Helm deployment includes the generated or validated TLS secret.
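
The following sketch shows two small helpers that relate to the custom TLS options. It is an illustration under assumptions: the function names are invented, and the key comparison relies on the `openssl` command being installed. The first helper derives the default server CN for a namespace. The second checks that a certificate and a private key belong together, which is a useful check before you store them in a secret.

```shell
#!/usr/bin/env bash
# Sketch: helpers for the custom TLS options. Function names are illustrative.

# Default server CN that the generated certificate uses for a namespace.
default_server_cn() {
  printf 'model-gateway-service.%s.svc\n' "$1"
}

# cert_matches_key CERT_FILE KEY_FILE
# Returns 0 when the certificate and the private key share one public key.
cert_matches_key() {
  local crt_pub key_pub
  crt_pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 1
  key_pub=$(openssl pkey -in "$2" -pubout 2>/dev/null) || return 1
  [ -n "$crt_pub" ] && [ "$crt_pub" = "$key_pub" ]
}

default_server_cn cp4ba-instance
```

A call such as `cert_matches_key tls.crt tls.key` returns a nonzero status when the files do not match or cannot be read.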

Procedure

Create Kubernetes secrets in the operator and instance namespaces to pull the required container images.

oc create secret docker-registry ibm-entitlement-key \
  --docker-server=cp.icr.io \
  --docker-username=cp \
  --docker-password=<your-entitlement-key> \
  -n <namespace>
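
Because the secret is needed in both the operator and the instance namespaces, the command can be repeated in a loop. The following is a dry-run sketch that only prints the commands. The `OPERATOR_NS` and `INSTANCE_NS` variables are hypothetical names, and their defaults are taken from Table 1.

```shell
#!/usr/bin/env bash
# Sketch: emit the pull-secret command once per distinct namespace.
# The commands are only printed (dry run); remove "echo" to run them.
operator_ns=${OPERATOR_NS:-cp4ba-operators}
instance_ns=${INSTANCE_NS:-cp4ba-instance}

# Deduplicate in case the operators and the deployment share one namespace.
namespaces=$(printf '%s\n' "$operator_ns" "$instance_ns" | sort -u)

for ns in $namespaces; do
  echo oc create secret docker-registry ibm-entitlement-key \
    --docker-server=cp.icr.io \
    --docker-username=cp \
    --docker-password='<your-entitlement-key>' \
    -n "$ns"
done
```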

Optional: Create a secret for your external PostgreSQL instance. If you do not want the script to create the secret from a property file, then create it manually.
The Kubernetes secret name that is used by the deployment must be model-gateway-postgres-external-secret, and the required secret keys are host, port, username, password, dbname, and parameters.
```
oc create secret generic model-gateway-postgres-external-secret \
  --from-literal=host=<postgres-host> \
  --from-literal=port=<postgres-port> \
  --from-literal=username=<db-username> \
  --from-literal=password=<db-password> \
  --from-literal=dbname=<database-name> \
  --from-literal=parameters='sslmode=require' \
  -n <instance-namespace>
```
Note: The supported SSL modes are disable, require, verify-ca, and verify-full.
If verify-ca or verify-full is used, the CA certificate must exist. If client certificate authentication is enabled, both the client certificate and the client key must exist.
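
The rules in the note can be expressed as a short pre-check. This is a sketch only: the function name `check_ssl_settings`, its argument order, and the literal value `true` for client certificate authentication are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: pre-check of PostgreSQL SSL settings before creating the secret.
# check_ssl_settings MODE CA_CERT USE_CLIENT_CERT CLIENT_CERT CLIENT_KEY
check_ssl_settings() {
  local mode=$1 ca=$2 use_client=$3 cert=$4 key=$5
  case "$mode" in
    disable|require) ;;
    verify-ca|verify-full)
      # Certificate verification needs the CA certificate file.
      [ -f "$ca" ] || { echo "CA certificate not found: $ca"; return 1; }
      ;;
    *) echo "unsupported SSL mode: $mode"; return 1 ;;
  esac
  if [ "$use_client" = "true" ]; then
    # Client certificate authentication needs both files.
    [ -f "$cert" ] && [ -f "$key" ] \
      || { echo "client certificate or key not found"; return 1; }
  fi
  return 0
}

check_ssl_settings require "" false "" "" && echo "SSL settings look valid"
```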
Run the baw-model-gateway-deployment.sh script to verify all the prerequisites are installed.
Run the following command from the scripts folder of your downloaded cert-kubernetes.
```
./baw-model-gateway-deployment.sh --verify-only
```
The script checks the prerequisites:
- Checks for required CLI tools (kubectl, helm, oc).
- Verifies Kubernetes cluster connectivity.
- Validates namespace existence.
- Checks for image pull secrets.
- Verifies storage class availability.
Run the baw-model-gateway-deployment.sh script to install the Model Gateway.
Run the following command from the scripts folder of your downloaded cert-kubernetes.
```
./baw-model-gateway-deployment.sh -n <instance-namespace>
```
The interactive mode requires an instance namespace and then takes the following actions:
1. Reads the CP4BA common configmap (ibm-cp4ba-common-config), which exists in your Business Automation Workflow <instance-namespace>, to determine whether operators and services use separate namespaces.
2. Prompts for license acceptance.
3. Prompts for storage classes.
4. Prompts for scale configuration.
5. Optionally configures custom TLS.
6. Optionally configures AI providers.
7. Prompts for the external PostgreSQL details.
  Option 1: Enter the secret that you created (model-gateway-postgres-external-secret).
  
  Option 2: If you want the script to create the secret, enter a property file (cp4ba_model_gateway.property) that contains values for the following properties.
```
MODEL_GATEWAY_POSTGRES_HOST
MODEL_GATEWAY_POSTGRES_PORT
MODEL_GATEWAY_POSTGRES_USERNAME
MODEL_GATEWAY_POSTGRES_PASSWORD
MODEL_GATEWAY_POSTGRES_DBNAME
MODEL_GATEWAY_POSTGRES_SSL_MODE
MODEL_GATEWAY_POSTGRES_CA_CERT_PATH
MODEL_GATEWAY_POSTGRES_USE_CLIENT_CERT
MODEL_GATEWAY_POSTGRES_CLIENT_CERT_PATH
MODEL_GATEWAY_POSTGRES_CLIENT_KEY_PATH
```
8. Shows a deployment summary.
9. Prompts for final confirmation.
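
For Option 2 of step 7, a property file might look like the one that the following sketch writes. The `KEY="value"` layout and every value are assumptions for illustration, because the exact file syntax is not described here. The loop then confirms that each of the ten listed properties is present.

```shell
#!/usr/bin/env bash
# Sketch: write a property file and confirm that every property is present.
# The KEY="value" layout and all values are assumptions for illustration.
prop_file=$(mktemp)
cat > "$prop_file" <<'EOF'
MODEL_GATEWAY_POSTGRES_HOST="postgres.example.com"
MODEL_GATEWAY_POSTGRES_PORT="5432"
MODEL_GATEWAY_POSTGRES_USERNAME="mgw_user"
MODEL_GATEWAY_POSTGRES_PASSWORD="change-me"
MODEL_GATEWAY_POSTGRES_DBNAME="mgwdb"
MODEL_GATEWAY_POSTGRES_SSL_MODE="require"
MODEL_GATEWAY_POSTGRES_CA_CERT_PATH=""
MODEL_GATEWAY_POSTGRES_USE_CLIENT_CERT="false"
MODEL_GATEWAY_POSTGRES_CLIENT_CERT_PATH=""
MODEL_GATEWAY_POSTGRES_CLIENT_KEY_PATH=""
EOF

missing=0
for name in HOST PORT USERNAME PASSWORD DBNAME SSL_MODE CA_CERT_PATH \
            USE_CLIENT_CERT CLIENT_CERT_PATH CLIENT_KEY_PATH; do
  # Each property must appear at the start of a line, followed by "=".
  grep -q "^MODEL_GATEWAY_POSTGRES_${name}=" "$prop_file" \
    || { echo "missing: MODEL_GATEWAY_POSTGRES_${name}"; missing=1; }
done
[ "$missing" -eq 0 ] && echo "all properties present in $prop_file"
```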
The following example shows the command with the required options.
```
./baw-model-gateway-deployment.sh \
  --accept-license \
  -o cp4ba-operators \
  -n cp4ba-instance \
  -b ocs-storagecluster-ceph-rbd \
  -f ocs-storagecluster-cephfs \
  -s medium
```
Tip: Use the -o cp4ba-operators parameter only if you chose to install the Business Automation Workflow operators and the Business Automation Workflow deployments in separate namespaces (separation of duties).
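
The rule in the tip can be sketched as a small wrapper that adds -o only when the operators run in their own namespace. The `build_args` function is a hypothetical helper and not part of the deployment script. The sketch uses `mapfile`, so it assumes Bash 4 or later, and it only prints the resulting command.

```shell
#!/usr/bin/env bash
# Sketch: add -o only when operators run in their own namespace.
# build_args INSTANCE_NS [OPERATOR_NS] prints one argument per line.
build_args() {
  local instance_ns=$1 operator_ns=${2:-}
  printf '%s\n' --accept-license -n "$instance_ns"
  if [ -n "$operator_ns" ] && [ "$operator_ns" != "$instance_ns" ]; then
    printf '%s\n' -o "$operator_ns"
  fi
}

# Collect the arguments into an array and show the resulting command.
mapfile -t args < <(build_args cp4ba-instance cp4ba-operators)
echo ./baw-model-gateway-deployment.sh "${args[@]}"
```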

The following example shows the command with custom namespaces.
```
./baw-model-gateway-deployment.sh \
  --accept-license \
  -o my-operators \
  -n my-instance \
  -b portworx-db \
  -f portworx-shared \
  -s large
```
The following example shows the command for a development or test deployment with SQLite.
```
./baw-model-gateway-deployment.sh \
  -n cp4ba-instance \
  --accept-license \
  --lite-install
```
The following example shows the command to enable custom TLS.
```
./baw-model-gateway-deployment.sh \
  -n cp4ba-instance \
  --accept-license \
  --use-custom-tls \
  -b ocs-storagecluster-ceph-rbd \
  -f ocs-storagecluster-cephfs
```
The following example shows the command with AI Provider configuration.
```
./baw-model-gateway-deployment.sh \
  -n cp4ba-instance \
  --accept-license \
  --configure-providers \
  -b ocs-storagecluster-ceph-rbd \
  -f ocs-storagecluster-cephfs
```
The --configure-providers parameter instructs the script to prompt you for the configuration of the selected AI providers. Select your choices from the available options, and provide the Kubernetes secret that is needed to make the connection.
1. Select the AI providers that you want.
2. Enter the URL and API key (credentials) for each selected provider.
3. Configure the models that you want to use from the selected providers.

Optional: Monitor the progress of the installation.

Watch the status.

oc get modelgateway modelgateway-cr -n cp4ba-instance -w

Check the operator logs.

oc logs -n cp4ba-operators \
  -l control-plane=ibm-cpd-model-gateway-operator \
  --tail=100 -f

Check the Model Gateway logs.

oc logs -n cp4ba-instance \
  -l app=model-gateway \
  --tail=100 -f

Verify that the deployment is ready.

Check the status of the Model Gateway deployment.
```
oc get modelgateway -n cp4ba-instance
```

Check all the pods.

oc get pods -n cp4ba-instance -l app=model-gateway

Check the PostgreSQL secret.

oc get secret model-gateway-postgres-external-secret -n <instance-namespace>

Check the AI Provider secret (if configured).

oc get secret model-gateway-provider-secret -n <instance-namespace>

Check the Redis cluster.
```
oc get rediscp -n cp4ba-instance
```

Access the Model Gateway service.

oc get svc model-gateway -n cp4ba-instance

Results

Use the provided Grafana dashboards to visualize the Prometheus metrics and OpenTelemetry traces.

Remember: The Model Gateway implements a fine-grained ACL system with deny-all by default. Users have no access to any actions or resources unless an admin explicitly grants permissions. All permissions are scoped down to the user and tenant level, controlling access to specific operations and provider types.

The Model Gateway collects metrics to monitor its performance and health and exports them to Prometheus.

Request count: Number of requests by provider, model, and tenant.
Latency: Request processing time.
Error rate: Percentage of failed requests by provider and tenant.
Token usage: Number of tokens that are used by tenant and model.
Cache hit and miss ratio: Percentage of cache hits and misses.
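
As a worked example of the percentage metrics, the following sketch derives an error rate and a cache hit ratio from raw counts. The numbers are invented sample values, and the `percent` helper is hypothetical. The real metric names that the Model Gateway exports are not shown here.

```shell
#!/usr/bin/env bash
# Sketch: derive percentage metrics from raw counts (sample numbers only).
# percent PART TOTAL prints PART/TOTAL as a percentage with two decimals.
percent() {
  awk -v part="$1" -v total="$2" \
    'BEGIN { if (total == 0) print "0.00"; else printf "%.2f\n", 100 * part / total }'
}

requests=1200; failed=18      # hypothetical request counts
hits=450; misses=150          # hypothetical cache lookups

echo "error rate: $(percent "$failed" "$requests")%"
echo "cache hit ratio: $(percent "$hits" $((hits + misses)))%"
```

With these sample counts, the error rate is 18 / 1200 = 1.50% and the cache hit ratio is 450 / 600 = 75.00%.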

Tracing provides the following information.

End-to-end request traces: Trace requests from client to provider and back.
Span attributes: Provider, model, tenant, and request ID.
Errors: Capture error details in spans.

Logging provides the following information.

Log levels: debug, info, warn, and error.
Context enrichment: Request ID, tenant ID, and user ID.
Sensitive data redaction: API keys, credentials, and PII.
JSON format: Machine-readable logs.

The Model Gateway provides several dashboards for monitoring.

Provider health: Status of each provider.
Tenant usage: Metrics by tenant.
System resources: CPU, memory, and network usage.
Error rates: Errors by provider, tenant, and endpoint.

What to do next

You can configure the Model Gateway programmatically by using the watsonx.ai REST API. For more information, see Setting up the model gateway with code. To run inference on foundation models, you can use the watsonx.ai REST API or the OpenAI Python SDK. For more information, see Inferencing models through the model gateway.

You can then enable generative AI in your Business Automation Workflow deployments by using the Model Gateway to access multiple LLM providers. For more information, see Enabling generative AI through the model gateway.

If the Model Gateway is not working as expected, then use the following troubleshooting steps to resolve any issues:

Run the describe command on the Model Gateway.

oc describe modelgateway modelgateway-cr -n cp4ba-instance

Check all the events in the namespace.

oc get events -n cp4ba-instance --sort-by='.lastTimestamp'

Check the status of Redis.

oc get rediscp -n cp4ba-instance -o yaml

Check the PVCs.
```
oc get pvc -n cp4ba-instance
```

Add AI Providers

To configure AI Providers with your deployment, run the script with the --configure-providers parameter. The --configure-providers parameter enables:

Interactive provider configuration with guided prompts.
Configuration merges that preserve existing settings.
Automatic CR patching with validation.
Pod restart after successful reconciliation.

To add providers to an existing deployment, run the following command.

./baw-model-gateway-deployment.sh \
  --configure-providers \
  --instance-namespace cp4ba-instance

After the CR is patched, the script waits until the status reports completion and then confirms the result.

status.modelgatewayStatus == "Completed"
status.progress == "100%"

If the status does not reach 100% within 5 minutes, the script skips the pod restart. After the reconciliation completes, you can restart the Model Gateway deployment manually by running the following command.

oc rollout restart deployment/model-gateway -n <instance-namespace>
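
If you restart manually, you might first wait for the same two status values that the script checks. The following sketch shows one way to poll for them with a timeout. It is not the script's own implementation: the function names are hypothetical, the CR name modelgateway-cr and the namespace are taken from the examples in this topic, and a stub stands in for the cluster call so that the sketch runs anywhere.

```shell
#!/usr/bin/env bash
# Sketch: poll until the status reports "Completed" and "100%", or time out.
# wait_for_completed GETTER [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# GETTER is any command that prints "<modelgatewayStatus> <progress>".
wait_for_completed() {
  local getter=$1 timeout=${2:-300} interval=${3:-10} elapsed=0 state
  while [ "$elapsed" -lt "$timeout" ]; do
    state=$("$getter" 2>/dev/null)
    [ "$state" = "Completed 100%" ] && return 0
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}

# On a cluster, a getter might read the two status fields with jsonpath.
cluster_status() {
  oc get modelgateway modelgateway-cr -n cp4ba-instance \
    -o jsonpath='{.status.modelgatewayStatus} {.status.progress}'
}

# Stub getter so that the sketch runs without a cluster.
stub_status() { echo "Completed 100%"; }

if wait_for_completed stub_status 5 1; then
  echo "reconciliation complete; safe to restart the deployment"
fi
```

On a real cluster, `wait_for_completed cluster_status` would use the default timeout of 300 seconds, which matches the 5-minute limit described above.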

Uninstall the Model Gateway

To uninstall the Model Gateway, run the script with the --uninstall parameter.

./baw-model-gateway-deployment.sh \
  --uninstall \
  -o cp4ba-operators \
  -n cp4ba-instance

The uninstall process completes the following actions:

Deletes the Model Gateway CR.
Waits for resources to be cleaned up.

Deletes the Model Gateway secrets.

[INFO] Deleting Model Gateway secrets...
[✔] Deleted secret: model-gateway-postgres-external-secret
[✔] Deleted secret: model-gateway-provider-secret

Uninstalls the operator Helm release.
Uninstalls the CRD.