Using the AI Gateway to support APIs for AI applications
API Connect provides a UI wizard to create AI-aware APIs and products, plus integration with watsonx.ai to forward requests and manage responses.
The AI Gateway makes it easy for enterprises to manage access to API endpoints used by AI applications. The AI Gateway simplifies the integration of AI into new and existing OpenAPI 3.0 APIs in API Connect to access a set of operations exposed by watsonx.ai.
- Use cases
- There are two cases where you should use the AI Gateway:
- As a reverse proxy for an existing API
In this case, the API contract with watsonx.ai (text inference) is unchanged.
- As a development tool for new APIs that use watsonx.ai
You can create APIs directly in API Connect and define policies in the API workflow to manage access to watsonx.ai.
- Benefits
- With the API Connect AI Gateway, you can centrally manage the use of AI through policy enforcement, data encryption, masking of sensitive data, access control, audit trails, and more, in support of your compliance obligations.
Using the AI Gateway to manage your API access to watsonx.ai lets you add controls to the API execution through the following features:
- An interface to IBM Cloud Authentication
watsonx.ai is hosted in IBM Cloud and requires an API key for access; the API key is obtained and managed through an IBM Cloud ID. API Connect provides a policy that enables your API to authenticate with IBM Cloud using an API key and obtain an authorization token required for accessing watsonx.ai.
- An interface to watsonx.ai
API Connect provides a policy that enables your API to send requests to watsonx.ai. The following watsonx.ai specs are supported:
/text_generation
/text_tokenization
/foundation_model_specs
- Response caching
The responses to API calls to watsonx.ai are cached, which improves response time for API calls and reduces cost for the API provider. When you create an API in API Connect, you can specify the duration for that API's response caching.
- Rate limiting
The AI Gateway enforces defined rate limits on APIs and API Plans (which manage access to individual APIs). You can configure rate limits based on the number of requests or generated tokens allowed within a particular time interval.
- Tokenization
Tokens are used as a unit of cost for LLM APIs. With the AI Gateway, rate limits can use the number of tokens generated by a request as a means of limiting usage and thus controlling costs. The token limit determines how many tokens are allowed to pass through the gateway within a specified period of time.
- Predefined policies and logic structures
The AI Gateway can use all of the existing policies and logic constructs provided by API Connect (for example, invoke, redact, and validate) in the API execution flow. In particular, API Connect provides the IBM Cloud Authentication and watsonx.ai invoke policies for authenticating with and accessing watsonx.ai.
- UI wizard for creating APIs
API Connect provides a UI wizard for creating an API to access operations exposed by watsonx.ai. The wizard creates an API with a configuration and assembly that applies token-based rate limiting, handles authentication with the watsonx.ai endpoint, and invokes the watsonx.ai API.
- API analytics
API Connect includes an analytics service that provides insights into AI Gateway usage. You can use the analytics data on the AI usage dashboard to manage costs and track performance. For more information, see Accessing analytics in the API Manager UI.
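The token-based rate limiting described under Rate limiting and Tokenization can be pictured as a budget of generated tokens per time interval. The following Python sketch illustrates that idea only; it is not the AI Gateway's implementation. The class name `TokenBudget`, its parameters, and the fixed-window strategy are assumptions made for illustration.

```python
import time


class TokenBudget:
    """Fixed-window limiter that counts generated tokens, not requests.

    Hypothetical illustration only; not API Connect code.
    """

    def __init__(self, max_tokens, window_seconds, clock=time.monotonic):
        self.max_tokens = max_tokens          # tokens allowed per window
        self.window_seconds = window_seconds  # length of one window
        self._clock = clock                   # injectable for testing
        self._window_start = clock()
        self._used = 0

    def _roll_window(self):
        # Start a fresh window (and reset usage) once the interval has elapsed.
        now = self._clock()
        if now - self._window_start >= self.window_seconds:
            self._window_start = now
            self._used = 0

    def remaining(self):
        """Tokens still available in the current window (never negative)."""
        self._roll_window()
        return max(self.max_tokens - self._used, 0)

    def allow_request(self):
        """Admit a new request only while some budget remains."""
        return self.remaining() > 0

    def record(self, generated_tokens):
        """Charge the tokens that a response generated against the window."""
        self._roll_window()
        self._used += generated_tokens
```

Because the number of generated tokens is known only after the model responds, this sketch charges the budget after each response. A single large response can therefore overshoot the limit, after which further requests are rejected until the window resets.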
Prerequisites for using the AI Gateway
Before attempting to use the AI Gateway, complete the following prerequisites:
- Obtain an IBM Cloud API key.
When you register for an IBM Cloud ID, an API key is automatically assigned. You can create additional API keys if needed. The API key is used by the IBM Cloud Authentication user-defined policy to generate a Bearer token, which is used to authenticate with the desired watsonx.ai service. For information on managing API keys, see Understanding API keys in the IBM Cloud documentation.
- Create a watsonx.ai project.
A watsonx.ai project is required for the user that is identified by the API key. The project will have a project ID. For information on creating projects, see Creating a project in the watsonx.ai documentation.
- Link the watsonx.ai project to an LLM service by completing the following steps:
- In the project, click Services and Integrations in the navigation list.
- Create a service; for example, you might call it "Watson Machine Learning".
- Associate that service with your project.
Failure to configure your project correctly will result in a 400 Bad Request response for attempted watsonx.ai operations.
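The first prerequisite notes that the IBM Cloud Authentication policy exchanges an API key for a Bearer token. The policy performs this exchange for you inside the gateway. To check an API key outside the gateway, the same exchange can be sketched in Python as follows. The endpoint and grant type are those of the public IBM Cloud IAM token service; the function names `build_token_request` and `fetch_bearer_token` are hypothetical helpers, not part of API Connect.

```python
import json
import urllib.parse
import urllib.request

# Public IBM Cloud IAM token endpoint.
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_token_request(api_key):
    """Build (but do not send) the form-encoded token request."""
    body = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("ascii")
    return urllib.request.Request(
        IAM_TOKEN_URL,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


def fetch_bearer_token(api_key):
    """Send the request and return the access token from the JSON response."""
    with urllib.request.urlopen(build_token_request(api_key)) as resp:
        return json.load(resp)["access_token"]
```

Calling `fetch_bearer_token` with a valid key returns a token string that is sent as `Authorization: Bearer <token>` on watsonx.ai requests. Keep the API key out of source control, for example by reading it from an environment variable.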
Getting started with the AI Gateway
To use the AI Gateway, complete the following steps:
- Set up your environment as explained in Prerequisites for using the AI Gateway.
- Create an API to use as a reverse proxy.
- Add the IBM Cloud Authentication and watsonx.ai invoke policies to the API so it can authenticate with IBM Cloud and access watsonx.ai.
- Create a custom product for the API, and include a plan that contains the required watson-ai-default and watson-ai-infer-text assembly rate limits, and then use that product for publishing the API. For instructions, see Creating a custom product for an AI Gateway API.
Attention: The auto-generated product for the API does not include a plan with the required assembly rate limits, and it cannot be modified. You must create a custom product and publish the API with the custom product.
- Review metrics on the API's performance in the AI usage dashboard, which tracks AI token and model usage.
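Once the API is published, a client application calls it through the gateway rather than calling watsonx.ai directly. The following Python sketch shows what such a client request might look like. Every specific in it is an assumption: `GATEWAY_URL` is a placeholder for your own published endpoint, the `X-IBM-Client-Id` header assumes the API is secured with a client ID, and the body fields (`model_id`, `project_id`, `input`) assume the watsonx.ai text generation contract is passed through unchanged, as in the reverse proxy use case.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the URL of your own published API.
GATEWAY_URL = "https://gateway.example.com/my-org/sandbox/ai/text_generation"


def build_generation_request(client_id, project_id, model_id, prompt):
    """Build (but do not send) a text generation request to the gateway."""
    payload = {
        "model_id": model_id,      # assumed field: model to run
        "project_id": project_id,  # assumed field: the watsonx.ai project ID
        "input": prompt,           # assumed field: the prompt text
    }
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumes the published API requires a client ID.
            "X-IBM-Client-Id": client_id,
        },
        method="POST",
    )
```

The client does not send an IBM Cloud API key or Bearer token in this sketch, because the IBM Cloud Authentication policy in the API assembly handles authentication with watsonx.ai on the gateway side.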