Model gateway (preview)

You can securely access and interact with foundation models from multiple model providers through the model gateway. The model gateway provides an OpenAI-compatible API that routes requests to foundation models from various model providers.

Use the model gateway to efficiently switch between multiple model providers by routing and formatting requests through a unified interface. You can build and deploy AI agents, RAG patterns, and more by using the gateway models.

Note:

The model gateway feature is in preview and available in the Toronto region only.

The model gateway is certified to access models from the following foundation model providers:

IBM watsonx.ai
OpenAI
Azure OpenAI
Anthropic
AWS Bedrock
Cerebras
NVIDIA NIM

Requirements

Access to model gateway is managed by an administrator who can grant permissions to users as needed.

To set up the model gateway, ensure the following account-level prerequisites are met:

The model gateway can only be configured by an API key that is associated with an administrator IBM Cloud account.

To access and use the model gateway, ensure that the following requirements are met in IBM Cloud Identity and Access Management (IAM) by an IBM Cloud administrator or account owner:

Users must be assigned administrator permissions to all watsonx.ai Runtime instances to use the model gateway.
A SecretsReader service role or higher must be assigned on the Secrets Manager instance.

Capabilities

You can use model gateway with the following capabilities:

Secure management of access providers: Integrate with IBM Cloud Secrets Manager to securely store and manage API keys and other sensitive configuration data. Secrets Manager securely manages access credentials between the model providers that you select and watsonx.ai. You can integrate with IBM Cloud Identity and Access Management (IAM) to enforce access control over who can retrieve and manage these secrets.
Access to multiple model providers: Connect to various model providers through a single, unified interface. With an OpenAI-compatible API endpoint, you can interact with different models by using a consistent request format. Built-in load balancing distributes requests across available model to optimize performance and prevent overload. Accessing multiple providers gives flexibility to integrate models based on your use case and accelerates testing and deployment without requiring changes to existing codebase.
Configure custom model endpoints: Deploy and manage custom foundation models by configuring endpoints through the model gateway. Custom endpoints provide secure and scalable integration of custom models into your applications.
List configured providers and models: Choose from a catalog of supported foundation models within watsonx.ai or other listed model providers. Users can list all configured model providers, view all models available for a specific provider, or display all models across providers.

Ways to work

You can use the following methods:

Programmatically: You can setup the model gateway programmatically by using the watsonx.ai REST API. To interact with the model gateway, you can use both the REST API or Python SDKs.

To work with the watsonx.ai REST API, see Developer resources.

To create AI services by using a OpenAI compatible provider, see the Use watsonx, and Model Gateway to run as AI service with load balancing sample notebook. To build LLM apps that route requests to providers by using the LangGraph framework and Model Gateway, see LangGraph Agent Template.
Through the Resource hub: You can use the model gateway to configure models that are hosted by third-party providers and make them accessible in the Resource hub. For details , see Managing gateway models in the Resource hub.

Note: Models added through the model gateway are not enabled for use in the Prompt Lab or Tuning Studio.

Workflow

The following diagram illustrates the workflow to set up the model gateway inference models through the gateway:

Diagram that shows IBM watsonx.ai model gateway workflow

As an administrator, perform the following steps to set up and use the model gateway:

Create a Secrets Manager service instance and allow the watsonx.ai Runtime service instance to access the Secrets Manager. Store the foundation model provider credentials in the Secrets Manager.
Configure foundation model providers through the model gateway by using the credentials stored in Secrets Manager.
Add models for each configured model provider.
Inference foundation models accessible through the gateway.