Setting up the model gateway programmatically

Store configuration details for various model providers in IBM Cloud Secrets Manager and then add models by using an API that is compatible with OpenAI.

Before you begin

  • Generate credentials to authenticate with watsonx.ai APIs. For details, see Generating a bearer token.
  • Get credentials for each supported model provider that you plan to use.
  • Get the list of available model providers and their UUIDs. For details, see Listing providers and models.

Procedure

  1. Define and create a secret in the IBM Software Hub vault for a model provider that you want to configure. If you use an external vault, specify the arguments for the model provider when you create the secret. For more information, see Managing secrets and vaults.

    The following table shows the configuration requirements for each supported model provider. For more details, see the respective provider’s inference documentation.

    Table 1. Keys that are needed to set up secrets for supported model providers
    | Provider name   | Required arguments                       | Optional arguments                                              | Notes |
    |-----------------|------------------------------------------|-----------------------------------------------------------------|-------|
    | OpenAI          | apikey                                   | base_url                                                        |       |
    | IBM watsonx.ai  | base_url                                 | apikey, project_id, space_id, auth_url, api_version             | Supports project_id, space_id per-request auth via headers (X-IBM-Project-Id, X-IBM-Space-Id) |
    | Azure OpenAI    | apikey, resource_name                    | api_version, subscription_id, resource_group_name, account_name | api_version defaults to 2024-10-21; subscription/resource group/account needed for model listing |
    | Anthropic       | apikey                                   |                                                                 |       |
    | AWS Bedrock     | access_key_id, secret_access_key, region | session_token, base_url                                         |       |
    | Cerebras        | apikey                                   |                                                                 |       |
    | Cohere          | apikey                                   |                                                                 |       |
    | Groq            | apikey                                   |                                                                 |       |
    | Mistral         | apikey                                   |                                                                 |       |
    | NVIDIA NIM      | apikey                                   |                                                                 |       |
    | Ollama          | host                                     | keep_alive, clean_on_close                                      | Self-hosted/local deployment; no authentication; uses custom Ollama API format (not OpenAI-compatible); keep_alive defaults to 5 minutes |
    | xAI             | apikey                                   |                                                                 |       |
    | Google Gemini   | apikey                                   |                                                                 |       |
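
As a quick sanity check before you create a secret, the required arguments in Table 1 can be encoded in a small shell helper that reports which keys are still missing. This is a minimal sketch: the provider labels (openai, watsonx, azure-openai, and so on) and the function names are hypothetical shorthand, not identifiers that the gateway defines, and the required sets for Azure OpenAI and AWS Bedrock follow one reading of the table.

```shell
#!/usr/bin/env bash
# Required secret keys per provider, transcribed from Table 1.
# The provider labels are hypothetical shorthand for this sketch.
required_keys() {
  case "$1" in
    openai|anthropic|cerebras|cohere|groq|mistral|nvidia-nim|xai|gemini)
      echo "apikey" ;;
    watsonx)       echo "base_url" ;;
    azure-openai)  echo "apikey resource_name" ;;
    bedrock)       echo "access_key_id secret_access_key region" ;;
    ollama)        echo "host" ;;
    *)             return 1 ;;
  esac
}

# Print the required keys that are absent from the supplied key names.
# Usage: missing_keys <provider> <key-name>...
missing_keys() {
  local provider="$1"; shift
  local required key supplied found missing=""
  if ! required="$(required_keys "$provider")"; then
    echo "unknown provider: $provider" >&2
    return 2
  fi
  for key in $required; do
    found=no
    for supplied in "$@"; do
      if [ "$supplied" = "$key" ]; then found=yes; fi
    done
    if [ "$found" = no ]; then missing="${missing:+$missing }$key"; fi
  done
  printf '%s\n' "$missing"
}
```

For example, `missing_keys azure-openai apikey` prints `resource_name`, and an empty line means that every required key is present.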

    Set the environment variable Vault_URN. You can copy the Vault_URN value from the Administration > Configurations and settings > Vaults and secrets page by clicking the Copy icon next to your secret name. For example:

    export Vault_URN="<user-id>:<secret-name>"
    
  2. Run the following REST API request to configure a model provider:

      curl -sS https://cpd-<namespace-name>.apps.<OCP-domain>/ml/gateway/v1/providers/<provider> \
        -H "Authorization: Bearer ${TOKEN}" \
        -H "Content-Type: application/json" \
        -d "$(jq -n \
          --arg resource "${Vault_URN}" \
          --arg name "<custom-name-for-provider>" \
          '{name: $name, data_reference: {resource: $resource}}')"
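
Before you send the request, it can help to confirm that the inputs are well formed. The following sketch assumes that the vault URN always has the two-part <user-id>:<secret-name> shape shown in the example and that the route follows the cpd-<namespace-name>.apps.<OCP-domain> pattern used in the command above. The function names gateway_url and valid_vault_urn are hypothetical.

```shell
#!/usr/bin/env bash
# Build the providers endpoint from the route pattern used in this procedure.
# Usage: gateway_url <namespace-name> <OCP-domain> <provider>
gateway_url() {
  local namespace="$1" domain="$2" provider="$3"
  printf 'https://cpd-%s.apps.%s/ml/gateway/v1/providers/%s\n' \
    "$namespace" "$domain" "$provider"
}

# Succeed only when the value looks like "<user-id>:<secret-name>",
# that is, a colon with non-empty text on both sides (assumed format).
valid_vault_urn() {
  local urn="$1" user secret
  case "$urn" in
    *:*) ;;
    *) return 1 ;;
  esac
  user="${urn%%:*}"
  secret="${urn#*:}"
  [ -n "$user" ] && [ -n "$secret" ]
}
```

A script could call `valid_vault_urn "${Vault_URN}"` and stop with an error message before it issues the curl request, which avoids sending an empty or malformed data_reference.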
    

    If the internal vault is enabled, you can also configure credentials directly by running the following command:

      curl --request POST \
        --url https://cpd-<namespace-name>.apps.<OCP-domain>/ml/gateway/v1/providers/<provider> \
        -H "Authorization: Bearer ${TOKEN}" \
        -H 'Content-Type: application/json' \
        --data '{
        "data": {
          "apikey": "<model-provider-api-key>"
        },
        "name": "<custom-name-for-model-provider>"
      }'
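
Providers such as Azure OpenAI or AWS Bedrock need more than one credential key. Assuming that the data object accepts the same key names that Table 1 lists for the secret (an assumption that is not confirmed here), a request body could be assembled with jq as in the following sketch. The helper name provider_payload is hypothetical, and the filter relies on $ARGS.named, which requires jq 1.6 or later.

```shell
#!/usr/bin/env bash
# Build a direct-credentials request body from key=value pairs.
# Usage: provider_payload <custom-name> <key=value>...
provider_payload() {
  local name="$1"; shift
  local args=() pair
  for pair in "$@"; do
    # Split each pair at the first "=": the left side is the key name,
    # the right side is the value (which may itself contain "=").
    args+=(--arg "${pair%%=*}" "${pair#*=}")
  done
  # $ARGS.named holds every --arg value; drop "name" so that only the
  # credential keys end up under "data".
  jq -n --arg name "$name" "${args[@]}" \
    '{name: $name, data: ($ARGS.named | del(.name))}'
}

# Example (not run here): pipe the body to curl, which reads it with -d @-.
# provider_payload "<custom-name-for-model-provider>" \
#     apikey="<model-provider-api-key>" resource_name="<resource-name>" |
#   curl --request POST \
#     --url "https://cpd-<namespace-name>.apps.<OCP-domain>/ml/gateway/v1/providers/<provider>" \
#     -H "Authorization: Bearer ${TOKEN}" \
#     -H "Content-Type: application/json" \
#     -d @-
```

Passing values through --arg lets jq handle JSON escaping, so keys that contain quotes or backslashes do not break the request body.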
    
  3. After a model provider is added, you can add models from that provider by specifying the provider’s UUID and the model’s ID in the request. The model ID must be an existing unique identifier that the provider recognizes. Because some models are available from multiple providers, you can also assign a model alias, which is a custom name that you use to reference the model instead of its model ID. For example, see the following command:

    curl -X POST "https://cpd-<namespace-name>.apps.<OCP-domain>/ml/gateway/v1/providers/${PROVIDER_UUID}/models" \
      -H "Content-Type: application/json" \
      -H "Authorization: Bearer ${TOKEN}" \
      -d '{ "alias": "<custom-name-for-model>", "id": "<model_id>"}'
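
When several models must be registered for the same provider, the request bodies can be generated in a loop. The following is a minimal sketch that assumes each model is given as an alias=model_id pair; the function name model_payloads and the aliases in the commented example are hypothetical.

```shell
#!/usr/bin/env bash
# Emit one compact JSON body per "alias=model_id" pair, one per line.
# Usage: model_payloads <alias=model_id>...
model_payloads() {
  local pair
  for pair in "$@"; do
    jq -cn --arg alias "${pair%%=*}" --arg id "${pair#*=}" \
      '{alias: $alias, id: $id}'
  done
}

# Example (not run here): post each body to the models endpoint.
# model_payloads "fast=<model_id_1>" "smart=<model_id_2>" |
#   while IFS= read -r body; do
#     curl -sS -X POST \
#       "https://cpd-<namespace-name>.apps.<OCP-domain>/ml/gateway/v1/providers/${PROVIDER_UUID}/models" \
#       -H "Content-Type: application/json" \
#       -H "Authorization: Bearer ${TOKEN}" \
#       -d "$body"
#   done
```

Emitting one body per line keeps the loop simple, because each line that read returns is a complete request payload.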
    

    For more details on each supported model provider, see the watsonx.ai API reference documentation.

What to do next

You can now send requests to models through the model gateway. For details, see Inferencing gateway models. You can also manage existing connections and models, enable load balancing, create access policies, and set rate limits.