Using the AI Gateway to support APIs for AI applications

API Connect provides a UI wizard to create AI-aware APIs and products, plus integration with watsonx.ai to forward requests and manage responses.

The AI Gateway makes it easy for enterprises to manage access to API endpoints used by AI applications. The AI Gateway simplifies the integration of AI into new and existing OpenAPI 3.0 APIs in API Connect to access a set of operations exposed by watsonx.ai.

Note: The AI Gateway is not a separate gateway service and requires no additional installation. It is a feature of the DataPower API Gateway service in API Connect, and provides a single point of control for organizations to access AI services through APIs.

Use cases
There are two cases where you should use the AI Gateway:
  • As a reverse proxy for an existing API

    In this case, the API contract with watsonx.ai (text inference) is unchanged.

  • As a development tool for new APIs that use watsonx.ai

    You can create APIs directly in API Connect and define policies in the API workflow to manage access to watsonx.ai.

Benefits
With the API Connect AI Gateway, you can centrally manage the use of AI through policy enforcement, data encryption, masking of sensitive data, access control, audit trails and more, in support of your compliance obligations.

Using the AI Gateway to manage your API access to watsonx.ai lets you add controls to the API execution through the following features:

  • An interface to IBM Cloud Authentication

    watsonx.ai is hosted in IBM Cloud and requires an API key for access; the API key is obtained and managed through an IBM Cloud ID. API Connect provides a policy that enables your API to authenticate with IBM Cloud using an API key and obtain an authorization token required for accessing watsonx.ai.

  • An interface to watsonx.ai

    API Connect provides a policy that enables your API to send requests to watsonx.ai. The following watsonx.ai specs are supported:
    • /text_generation
    • /text_tokenization
    • /foundation_model_specs

  • Response caching

    Responses to API calls to watsonx.ai are cached, which improves response time for repeated calls and reduces cost for the API provider. When you create an API in API Connect, you can specify the duration for that API's response caching.

  • Rate limiting

    The AI Gateway enforces defined rate limits on APIs and API Plans (which manage access to individual APIs). You can configure rate limits based on the number of requests or generated tokens allowed within a particular time interval.

  • Tokenization

    Tokens are used as a unit of cost for LLM APIs. With the AI Gateway, rate limits can use the number of tokens generated by a request as a means of limiting usage and thus controlling costs. The token limit determines how many tokens are allowed to pass through the gateway within a specified period of time.

  • Predefined policies and logic structures

    The AI Gateway can use all of the existing policies and logic constructs provided by API Connect (for example, invoke, redact, and validate) in the API execution flow. In particular, API Connect provides the IBM Cloud Authentication and watsonx.ai invoke policies for authenticating with, and accessing, watsonx.ai.

  • UI wizard for creating APIs

    API Connect provides a UI wizard for creating an API to access operations exposed by watsonx.ai. The wizard creates an API with a configuration and assembly that applies token-based rate limiting, handles authentication with the watsonx.ai endpoint, and invokes the watsonx.ai API.

  • API analytics

    The API Connect analytics service gives you insight into AI Gateway usage. You can use the analytics data on the AI usage dashboard to manage costs and track performance. For more information, see Accessing analytics in the API Manager UI.
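
The token-based rate limiting described above can be illustrated with a small model. The following is a minimal sketch only, not the gateway's implementation: the class name TokenWindowLimiter, the fixed-window strategy, and the parameter names are all assumptions made for illustration. It shows the core idea of charging each response's generated tokens against a budget that resets after a time interval.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TokenWindowLimiter:
    """Fixed-window limiter that counts generated tokens, not requests.

    Hypothetical illustration; the AI Gateway's internal algorithm may differ.
    """

    max_tokens: int                  # token budget per window (assumed value)
    window_seconds: float            # window length in seconds (assumed value)
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    _window_start: float = field(default=0.0, init=False)
    _used: int = field(default=0, init=False)

    def __post_init__(self) -> None:
        self._window_start = self.clock()

    def _roll_window(self) -> None:
        # Start a fresh window, with a full budget, once the interval has elapsed.
        now = self.clock()
        if now - self._window_start >= self.window_seconds:
            self._window_start = now
            self._used = 0

    def allow_request(self) -> bool:
        """Admit a request only while the current window has budget left."""
        self._roll_window()
        return self._used < self.max_tokens

    def record_tokens(self, generated_tokens: int) -> None:
        """Charge the tokens reported by a model response to the window."""
        self._roll_window()
        self._used += generated_tokens

    @property
    def remaining(self) -> int:
        self._roll_window()
        return max(self.max_tokens - self._used, 0)
```

Because the number of generated tokens is known only after the model responds, this sketch lets the request that crosses the limit complete and rejects later requests until the window resets. That ordering is a design choice of the sketch, not a statement about how the gateway behaves.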

Prerequisites for using the AI Gateway

Before attempting to use the AI Gateway, complete the following prerequisites:

  • Obtain an IBM Cloud API key.

    When you register for an IBM Cloud ID, an API key is automatically assigned. You can create additional API keys if needed. The IBM Cloud Authentication user-defined policy uses the API key to generate a Bearer token, which authenticates requests to the desired watsonx.ai service. For information on managing API keys, see Understanding API keys in the IBM Cloud documentation.

  • Create a watsonx.ai project.

    A watsonx.ai project is required for the user that is identified by the API key. The project will have a project ID. For information on creating projects, see Creating a project in the watsonx.ai documentation.

  • Link the watsonx.ai project to an LLM service by completing the following steps:
    1. In the project, click Services and Integrations in the navigation list.
    2. Create a service; for example, you might call it "Watson Machine Learning".
    3. Associate that service with your project.

    Failure to configure your project correctly will result in a 400 Bad Request response for attempted watsonx.ai operations.
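
To check an API key before configuring the gateway, it can help to see what the key-for-token exchange looks like when done by hand. The following is a minimal sketch, assuming the public IBM Cloud IAM token endpoint and the API-key grant type as described in the IBM Cloud IAM documentation; neither value comes from this topic, and the function names are hypothetical. In the AI Gateway, the IBM Cloud Authentication policy performs this exchange for you.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public IBM Cloud IAM token endpoint (from IBM Cloud IAM docs).
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_token_request(api_key: str) -> urllib.request.Request:
    """Build the form-encoded POST that exchanges an API key for a token."""
    body = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("utf-8")
    return urllib.request.Request(
        IAM_TOKEN_URL,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )


def bearer_header(token_response: bytes) -> dict[str, str]:
    """Turn the JSON token response into an Authorization header."""
    payload = json.loads(token_response)
    return {"Authorization": f"Bearer {payload['access_token']}"}


def fetch_bearer_header(api_key: str) -> dict[str, str]:
    """Perform the exchange over the network (requires a valid API key)."""
    with urllib.request.urlopen(build_token_request(api_key), timeout=30) as resp:
        return bearer_header(resp.read())
```

Calling fetch_bearer_header with a valid key returns a header that can be attached to a direct watsonx.ai request, which is a quick way to confirm that the key and project are set up before you publish an API.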

Getting started with the AI Gateway

To use the AI Gateway, complete the following steps:

  1. Set up your environment as explained in Prerequisites for using the AI Gateway.

  2. Create an API to use as a reverse proxy.

  3. Add the IBM Cloud Authentication and watsonx.ai invoke policies to the API so it can authenticate with IBM Cloud and access watsonx.ai.

  4. Create a custom product for the API. Include a plan that contains the required watson-ai-default and watson-ai-infer-text assembly rate limits, and then publish the API with that product. For instructions, see Creating a custom product for an AI Gateway API.
    Attention: The auto-generated product for the API does not include a plan with the required assembly rate limits, and it cannot be modified. You must create a custom product and publish the API with the custom product.
  5. Review metrics on the API's performance in the AI usage dashboard, which tracks AI token and model usage.
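
After the API is published, a client application calls it like any other API Connect API. The following is a rough sketch of such a call, under several assumptions that are not taken from this topic: the gateway base URL and model ID are placeholders, the X-IBM-Client-Id header assumes the API is secured with a client ID, and the request body mirrors the fields of the watsonx.ai text generation request (model_id, input, project_id, parameters). Check the API definition that you created for the actual path and security requirements.

```python
import json
import urllib.request

# Hypothetical placeholder; replace with the endpoint of your published API.
GATEWAY_BASE_URL = "https://gateway.example.com/my-org/sandbox/ai-demo"


def build_generation_request(
    prompt: str,
    project_id: str,
    client_id: str,
    model_id: str = "example-model-id",  # placeholder, not a real model ID
    max_new_tokens: int = 50,
) -> urllib.request.Request:
    """Build a text generation request addressed to the gateway."""
    body = {
        "model_id": model_id,
        "input": prompt,
        "project_id": project_id,
        # Capping generated tokens also limits what counts toward token limits.
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        f"{GATEWAY_BASE_URL}/text_generation",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "X-IBM-Client-Id": client_id,  # assumes client ID security
        },
        method="POST",
    )


def generate(prompt: str, project_id: str, client_id: str) -> dict:
    """Send the request; the gateway handles IBM Cloud authentication."""
    request = build_generation_request(prompt, project_id, client_id)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.loads(resp.read())
```

Note that the client does not send an IBM Cloud API key or Bearer token in this sketch. Keeping those credentials inside the gateway, where the IBM Cloud Authentication policy uses them, is the reason for routing the call through the AI Gateway.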