Registering a custom foundation model

After creating a PVC and copying the custom foundation model into the PVC, a cluster administrator must register the custom foundation model to make it available for deployment and inferencing with watsonx.ai.

Service: The required watsonx.ai service and other supplemental services are not available by default. An administrator must install these services on the IBM Cloud Pak for Data platform. To determine whether a service is installed, open the Services catalog and check whether the service is enabled.

To perform this task, you must be a Cloud Pak for Data administrator.

Registering your model

To register a custom foundation model:

  1. Log in to OpenShift and open the Watsonxaiifm custom resource (CR) for editing. To register a new model, append a model entry to this resource:

    oc edit Watsonxaiifm
    
  2. Add a model entry under spec.custom_foundation_models and enter the following details:

    Model parameters

    Field        Type     Description                                            Mandatory
    model_id     String   Specify the ID of the custom foundation model.         Yes
    location     Object   Specify the location of the custom foundation model.   Yes
    tags         String   Provide additional metadata about the model.           No
    parameters   Object   Specify the parameters of the model.                   No

    For more information, see Properties and parameters for your custom foundation model.

    For example:

    apiVersion: watsonxaiifm.cpd.ibm.com/v1beta1
    kind: Watsonxaiifm
    metadata:
      name: watsonxaiifm-cr
    ......
    spec:
      ignoreForMaintenance: false
      .......
      custom_foundation_models:
      - location:
          pvc_name: example_model_pvc
        model_id: example_model_70b
        parameters:
        - default: float16
          name: dtype
          options:
          - float16
          - bfloat16
        - default: 256
          max: 512
          min: 16
          name: max_batch_size
        - default: 64
          max: 128
          min: 0
          name: max_concurrent_requests
        - default: 2048
          max: 8192
          min: 256
          name: max_sequence_length
        - default: 2048
          max: 4096
          min: 512
          name: max_new_tokens
        tags:
        - example_model
        - 70b
      - location:
          pvc_name: example_model_pvc_13b
        model_id: example_model_13b
    

    After registering the custom foundation model in the CR, wait for two minutes to allow the operator to reconcile.
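
The edit in step 1 can also be scripted rather than made in an interactive editor. The following is a minimal sketch, not a documented procedure. It assumes that the CR is named watsonxaiifm-cr (as in the example above), that a NAMESPACE variable holds the project where watsonx.ai is installed, and that spec.custom_foundation_models already contains at least one entry. The helper name build_model_patch and the APPLY switch are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only. Assumptions not stated in the steps above:
#   - the CR is named watsonxaiifm-cr (taken from the example YAML)
#   - NAMESPACE holds the project where watsonx.ai is installed
#   - spec.custom_foundation_models already exists (a JSON Patch "add" to
#     ".../-" appends to an existing array and fails if the array is missing)

# Print a JSON Patch document that appends one model entry containing only
# the two mandatory fields, model_id and location.
# The arguments are inserted as-is, so they must not contain quotes or backslashes.
build_model_patch() {
  local model_id="$1" pvc_name="$2"
  printf '[{"op":"add","path":"/spec/custom_foundation_models/-","value":{"model_id":"%s","location":{"pvc_name":"%s"}}}]' \
    "$model_id" "$pvc_name"
}

# Nothing touches the cluster unless APPLY=1 is set explicitly.
if [ "${APPLY:-0}" = "1" ]; then
  : "${NAMESPACE:?set NAMESPACE to the watsonx.ai project}"
  oc patch Watsonxaiifm watsonxaiifm-cr -n "$NAMESPACE" --type=json \
    -p "$(build_model_patch example_model_13b example_model_pvc_13b)"
  # List the model IDs now stored in the CR to confirm that the entry was saved.
  oc get Watsonxaiifm watsonxaiifm-cr -n "$NAMESPACE" \
    -o jsonpath='{.spec.custom_foundation_models[*].model_id}'
fi
```

Running the script without APPLY=1 only defines the helper, which makes it possible to inspect the generated patch before sending it to the cluster. Optional fields such as tags and parameters are left out here; add them with oc edit if the model needs them.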

Next steps

Creating a deployment for a custom foundation model
