Tuning a foundation model programmatically

You can programmatically tune a set of foundation models in watsonx.ai to customize the models for your use case.

You can programmatically fine tune a foundation model with the following techniques:

  • Full fine tuning
  • Low-rank adaptation (LoRA) fine tuning
  • Quantized low-rank adaptation (QLoRA) fine tuning

You can use any fine tuning method to tune custom foundation models.

Required permissions

To run tuning experiments, you must have the Admin or Editor role in a project.

Required credentials

You must generate credentials to authenticate with watsonx.ai APIs. For details, see Generating a bearer token.

Data format

Tabular: JSON, JSONL

Tabular data from supported data connections. For details, see Data formats.

Note: You can use the same training data with one or more tuning experiments.

Data size

50 to 10,000 input and output example pairs. The maximum file size is 200 MB.
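
A quick pre-flight check can catch size problems before you upload a file. The following Python sketch applies the limits stated above to a JSONL file. It assumes that each line is a JSON object with `input` and `output` keys (an assumption; see Data formats for the authoritative schema) and that 200 MB means 200 × 1024 × 1024 bytes. The function name `check_training_file` is hypothetical.

```python
import json
import os

MIN_EXAMPLES = 50
MAX_EXAMPLES = 10_000
MAX_FILE_BYTES = 200 * 1024 * 1024  # 200 MB, interpreted as MiB (assumption)


def check_training_file(path):
    """Return a list of problems found in a JSONL training file (empty if none)."""
    problems = []
    if os.path.getsize(path) > MAX_FILE_BYTES:
        problems.append("file is larger than 200 MB")

    count = 0
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {line_no}: not valid JSON")
                continue
            # Assumed schema: one input/output pair per line.
            if not isinstance(record, dict) or "input" not in record or "output" not in record:
                problems.append(f"line {line_no}: missing 'input' or 'output'")
                continue
            count += 1

    if not MIN_EXAMPLES <= count <= MAX_EXAMPLES:
        problems.append(
            f"{count} examples found; expected {MIN_EXAMPLES} to {MAX_EXAMPLES}"
        )
    return problems
```

An empty list means that the file passed the checks in this sketch; it does not guarantee that the service accepts the file.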

Ways to develop

You can tune foundation models by using these programming methods:

  • REST API
  • Python
  • Node.js

Alternatively, you can use graphical tools from the watsonx.ai UI to tune foundation models. See Tuning Studio.

Supported foundation models

To get a list of foundation models that support full fine tuning, use the following API request:

curl -X GET \
  'https://cpd-<namespace-name>.apps.<OCP-domain>/ml/v1/foundation_model_specs?version=2025-02-20&filters=function_fine_tune_trainable'

To get a list of foundation models that support LoRA or QLoRA fine tuning, use the following API request:

curl -X GET \
  'https://cpd-<namespace-name>.apps.<OCP-domain>/ml/v1/foundation_model_specs?version=2025-02-20&filters=function_lora_fine_tune_trainable'

See Choosing a foundation model to tune.

You can use LoRA only with non-quantized models and QLoRA only with quantized models.
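
If you script these lookups, the two requests differ only in the `filters` value. The following Python sketch builds the request URL for a given technique and picks the adapter type from the quantization rule above. The helper names, the `base_url` argument, and the `technique` labels are hypothetical; the filter values and version date are copied from the curl examples.

```python
from urllib.parse import urlencode

# Filter values taken from the curl examples above.
FILTERS = {
    "full": "function_fine_tune_trainable",
    "lora": "function_lora_fine_tune_trainable",
    "qlora": "function_lora_fine_tune_trainable",
}


def model_specs_url(base_url, technique, version="2025-02-20"):
    """Build the foundation_model_specs URL for a tuning technique."""
    query = urlencode({"version": version, "filters": FILTERS[technique]})
    return f"{base_url.rstrip('/')}/ml/v1/foundation_model_specs?{query}"


def peft_type_for(is_quantized):
    """LoRA applies to non-quantized models, QLoRA to quantized models."""
    return "qlora" if is_quantized else "lora"
```

For example, `model_specs_url("https://cpd-<namespace-name>.apps.<OCP-domain>", "lora")` would need the placeholders replaced with your cluster values before the URL is usable.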

REST API

The high-level steps are mostly the same for each technique. The key differences are the values that you include in the request body for the fine-tuning training job; these differences are highlighted in the following procedure. For details about the REST API methods, see Fine tunings in the watsonx.ai API reference documentation.

  1. Create a training data file to use for tuning the foundation model.

    For more information about the training data file requirements, see Data formats for tuning foundation models.

  2. Make your training data file available for the API to use. You can create a data ass

    You can do one of the following things:

    You will use the asset ID and training data file details when you add the training_data_references section of the REST API request body.

  3. Use the watsonx.ai API to create a training experiment. See the Create a fine tuning job method.

    Submit the POST request to the following endpoint:

    curl --request POST 'https://cpd-<namespace-name>.apps.<OCP-domain>/ml/v1/fine_tunings?version=2025-02-14' \
      --header 'Accept: application/json' \
      --header 'Content-Type: application/json' \
      --header "Authorization: Bearer ${TOKEN}" \
      --data '<request-body>'
    

    Customize the experiment by setting values for various parameters in the FineTuningParameters payload. For details, see Choosing a model to tune and Parameters for tuning foundation models.

    Set auto_update_model to true to save the generated output as an asset that you can use when you deploy the tuned foundation model later. Otherwise, you must save the tuned model or adapters that are generated by the experiment to the repository service to generate an asset_id before you can use them in the deployment.

    The following sample request body creates a full fine-tuning experiment:

    {
      "project_id": "<project-id>",
      "name": "my fft experiment",
      "auto_update_model": true,
      "tuned_model_name": "my-fine-tuned-model",
      "parameters": {
        "base_model": {
          "model_id": "ibm/granite-3-1-8b-base" },
        "task_id": "classification",
        "num_epochs": 10,
        "learning_rate": 0.00001,
        "batch_size": 5,
        "max_seq_length": 1024,
        "accumulate_steps": 1,
        "gpu": {
          "num": 4
        }
      },
      "results_reference": {
        "location": {
          "path": "full_fine_tuning/results" },
        "type": "fs"
      },
      "training_data_references": [
        {
        "location": {
          "href":"/v2/assets/1e6591a2-c69d-4716-92e3-73e8c2270956?project_id=<project-id>",
          "id":"1e6591a2-c69d-4716-92e3-73e8c2270956" },
        "type": "data_asset"
        }
      ]
    }
    

    The following sample shows the API request output:

    {
      "entity": {
        "auto_update_model": true,
        "parameters": {
          "accumulate_steps": 1,
          "base_model": {
            "model_id": "ibm/granite-3-1-8b-base"
          },
          "batch_size": 5,
          "gpu": {
            "num": 4
          },
          "learning_rate": 0.00001,
          "max_seq_length": 1024,
          "num_epochs": 10,
          "response_template": "\n### Response:",
          "task_id": "classification",
          "verbalizer": "### Input:  \n\n### Response: "
        },
        "results_reference": {
          "location": {
            "path": "/projects/<project-id>/assets/full_fine_tuning/results",
            "notebooks_path": "/projects/<project-id>/assets/full_fine_tuning/results/63e98673-a2c0-45c1-8ac6-e26a47ec1914/notebooks",
            "training": "/projects/<project-id>/assets/full_fine_tuning/results/63e98673-a2c0-45c1-8ac6-e26a47ec1914",
            "training_status": "/projects/<project-id>/assets/full_fine_tuning/results/63e98673-a2c0-45c1-8ac6-e26a47ec1914/training-status.json",
            "assets_path": "/projects/<project-id>/assets/full_fine_tuning/results/63e98673-a2c0-45c1-8ac6-e26a47ec1914/assets"
          },
          "type": "fs"
        },
        "status": {
          "state": "pending"
        },
        "training_data_references": [
          {
            "location": {
              "href": "/v2/assets/1e6591a2-c69d-4716-92e3-73e8c2270956?project_id=<project-id>",
              "id": "1e6591a2-c69d-4716-92e3-73e8c2270956"
            },
            "type": "data_asset"
          }
        ],
        "tuned_model": {
          "name": "my-fine-tuned-model-63e98673-a2c0-45c1-8ac6-e26a47ec1914"
        }
      },
      "metadata": {
        "created_at": "2025-02-14T20:49:03.959Z",
        "id": "63e98673-a2c0-45c1-8ac6-e26a47ec1914",
        "modified_at": "2025-02-14T20:49:03.959Z",
        "name": "my fft experiment",
        "project_id": "<project-id>"
      }
    }
    

    The following sample request body creates a LoRA fine-tuning experiment:

    {
      "project_id": "<project-id>",
      "name": "my LoRA experiment",
      "auto_update_model": true,
      "tuned_model_name": "my-lora-tuned-model",
      "parameters": {
        "base_model": {
          "model_id": "ibm/granite-3-1-8b-base" },
        "task_id": "classification",
        "num_epochs": 10,
        "learning_rate": 0.00001,
        "batch_size": 5,
        "max_seq_length": 4096,
        "accumulate_steps": 1,
        "gpu": {
          "num": 1
        },
        "peft_parameters": {
          "type": "lora",
          "rank": 8,
          "lora_alpha": 32,
          "lora_dropout": 0.05,
          "target_modules": ["all-linear"]
        }
      },
      "results_reference": {
        "location": {
          "path": "fine_tuning/results" },
        "type": "fs"
      },
      "training_data_references": [
        {
        "location": {
          "href":"/v2/assets/1e6591a2-c69d-4716-92e3-73e8c2270956?project_id=<project-id>",
          "id":"1e6591a2-c69d-4716-92e3-73e8c2270956" },
        "type": "data_asset"
        }
      ]
    }
    

    The following sample shows the API request output:

    {
      "entity": {
        "auto_update_model": true,
        "parameters": {
          "accumulate_steps": 1,
          "base_model": {
            "model_id": "ibm/granite-3-1-8b-base"
          },
          "batch_size": 5,
          "gpu": {
            "num": 1
          },
          "learning_rate": 0.00001,
          "max_seq_length": 4096,
          "num_epochs": 10,
          "peft_parameters": {
            "lora_alpha": 32,
            "lora_dropout": 0.05,
            "rank": 8,
            "target_modules": [
              "all-linear"
            ],
            "type": "lora"
          },
          "response_template": "\n### Response:",
          "task_id": "classification",
          "verbalizer": "### Input:  \n\n### Response: "
        },
        "results_reference": {
          "location": {
            "path": "/projects/<project-id>/assets/fine_tuning/results",
            "notebooks_path": "/projects/<project-id>/assets/fine_tuning/results/2491b2d9-bf96-4d3f-9ea7-8604861471e1/notebooks",
            "training": "/projects/<project-id>/assets/fine_tuning/results/2491b2d9-bf96-4d3f-9ea7-8604861471e1",
            "training_status": "/projects/<project-id>/assets/fine_tuning/results/2491b2d9-bf96-4d3f-9ea7-8604861471e1/training-status.json",
            "assets_path": "/projects/<project-id>/assets/fine_tuning/results/2491b2d9-bf96-4d3f-9ea7-8604861471e1/assets"
          },
          "type": "fs"
        },
        "status": {
          "state": "pending"
        },
        "training_data_references": [
          {
            "location": {
              "href": "/v2/assets/1e6591a2-c69d-4716-92e3-73e8c2270956?project_id=<project-id>",
              "id": "1e6591a2-c69d-4716-92e3-73e8c2270956"
            },
            "type": "data_asset"
          }
        ],
        "tuned_model": {
          "name": "my-lora-tuned-model-2491b2d9-bf96-4d3f-9ea7-8604861471e1"
        }
      },
      "metadata": {
        "created_at": "2025-02-14T19:47:36.629Z",
        "id": "2491b2d9-bf96-4d3f-9ea7-8604861471e1",
        "modified_at": "2025-02-14T19:47:36.629Z",
        "name": "my LoRA experiment",
        "project_id": "<project-id>"
      }
    }
    

    The following sample request body creates a QLoRA fine-tuning experiment:

    {
      "project_id": "<project-id>",
      "name": "my QLoRA experiment",
      "auto_update_model": true,
      "tuned_model_name": "my-qlora-tuned-model",
      "parameters": {
        "base_model": {
          "model_id": "meta-llama/llama-3-1-70b-gptq" },
        "task_id": "classification",
        "num_epochs": 10,
        "learning_rate": 0.00001,
        "batch_size": 5,
        "max_seq_length": 1024,
        "accumulate_steps": 1,
        "gpu": {
          "num": 1
        },
        "peft_parameters": {
          "type": "qlora",
          "rank": 8,
          "lora_alpha": 32,
          "lora_dropout": 0.05,
          "target_modules": []
        }
      },
      "results_reference": {
        "location": {
          "path": "fine_tuning/results" },
        "type": "fs"
      },
      "training_data_references": [
        {
        "location": {
          "href":"/v2/assets/1e6591a2-c69d-4716-92e3-73e8c2270956?project_id=<project-id>",
          "id":"1e6591a2-c69d-4716-92e3-73e8c2270956" },
        "type": "data_asset"
        }
      ]
    }
    

    The following sample shows the API request output:

    {
      "entity": {
        "auto_update_model": true,
        "parameters": {
          "accumulate_steps": 1,
          "base_model": {
            "model_id": "meta-llama/llama-3-1-70b-gptq"
          },
          "batch_size": 5,
          "gpu": {
            "num": 1
          },
          "learning_rate": 0.00001,
          "max_seq_length": 1024,
          "num_epochs": 10,
          "peft_parameters": {
            "lora_alpha": 32,
            "lora_dropout": 0.05,
            "rank": 8,
            "target_modules": [],
            "type": "qlora"
          },
          "response_template": "\n### Response:",
          "task_id": "classification",
          "verbalizer": "### Input:  \n\n### Response: "
        },
        "results_reference": {
          "location": {
            "path": "/projects/<project-id>/assets/fine_tuning/results",
            "notebooks_path": "/projects/<project-id>/assets/fine_tuning/results/2491b2d9-bf96-4d3f-9ea7-8604861471e1/notebooks",
            "training": "/projects/<project-id>/assets/fine_tuning/results/2491b2d9-bf96-4d3f-9ea7-8604861471e1",
            "training_status": "/projects/<project-id>/assets/fine_tuning/results/2491b2d9-bf96-4d3f-9ea7-8604861471e1/training-status.json",
            "assets_path": "/projects/<project-id>/assets/fine_tuning/results/2491b2d9-bf96-4d3f-9ea7-8604861471e1/assets"
          },
          "type": "fs"
        },
        "status": {
          "state": "pending"
        },
        "training_data_references": [
          {
            "location": {
              "href": "/v2/assets/1e6591a2-c69d-4716-92e3-73e8c2270956?project_id=<project-id>",
              "id": "1e6591a2-c69d-4716-92e3-73e8c2270956"
            },
            "type": "data_asset"
          }
        ],
        "tuned_model": {
          "name": "my-qlora-tuned-model-2491b2d9-bf96-4d3f-9ea7-8604861471e1"
        }
      },
      "metadata": {
        "created_at": "2025-02-14T19:47:36.629Z",
        "id": "2491b2d9-bf96-4d3f-9ea7-8604861471e1",
        "modified_at": "2025-02-14T19:47:36.629Z",
        "name": "my QLoRA experiment",
        "project_id": "<project-id>"
      }
    }
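
The three sample request bodies in step 3 differ mainly in the peft_parameters section and the GPU count. As a rough sketch, the following Python helper assembles a body with the same shape for any of the three techniques. The function name `build_fine_tuning_payload` and its arguments are hypothetical, and the default values are copied from the samples rather than recommended.

```python
import json


def build_fine_tuning_payload(project_id, name, tuned_model_name, model_id,
                              asset_id, technique="full"):
    """Assemble a fine-tuning request body shaped like the samples above."""
    if technique not in ("full", "lora", "qlora"):
        raise ValueError(f"unknown technique: {technique}")

    parameters = {
        "base_model": {"model_id": model_id},
        "task_id": "classification",
        "num_epochs": 10,
        "learning_rate": 0.00001,
        "batch_size": 5,
        "max_seq_length": 1024,
        "accumulate_steps": 1,
        # The samples use 4 GPUs for full fine tuning and 1 for LoRA and QLoRA.
        "gpu": {"num": 4 if technique == "full" else 1},
    }
    if technique != "full":
        # Only LoRA and QLoRA requests carry a peft_parameters section.
        parameters["peft_parameters"] = {
            "type": technique,
            "rank": 8,
            "lora_alpha": 32,
            "lora_dropout": 0.05,
            "target_modules": ["all-linear"] if technique == "lora" else [],
        }

    return {
        "project_id": project_id,
        "name": name,
        "auto_update_model": True,
        "tuned_model_name": tuned_model_name,
        "parameters": parameters,
        "results_reference": {
            "location": {"path": "fine_tuning/results"},
            "type": "fs",
        },
        "training_data_references": [
            {
                "location": {
                    "href": f"/v2/assets/{asset_id}?project_id={project_id}",
                    "id": asset_id,
                },
                "type": "data_asset",
            }
        ],
    }


if __name__ == "__main__":
    body = build_fine_tuning_payload(
        "<project-id>", "my LoRA experiment", "my-lora-tuned-model",
        "ibm/granite-3-1-8b-base", "<asset-id>", technique="lora")
    print(json.dumps(body, indent=2))
```

The printed JSON can be saved to a file and sent as the body of the POST request.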
    

Using custom parameters

The fine-tuning API accepts an optional custom object in the training payload that enables you to pass arbitrary parameters directly to the underlying trainer (fms-hf-tuning). This advanced feature provides flexibility for experienced users who need to configure training parameters beyond the standard options.

Important: Using custom parameters is not enabled by default. Before using custom parameters, make sure that a cluster administrator enabled this feature.

Behavior

  • When enabled: Values in custom.parameters are merged into the generated trainer configuration. If a key conflicts with a standard parameter, the custom value takes precedence (override order: custom.parameters > parameters).
  • When disabled: The custom object is persisted and returned in API responses, but custom.parameters have no effect on training.
  • Validation: No validation is performed on custom parameters. Incompatible parameters will cause the trainer to fail at runtime, not at submission time.
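
The precedence rule can be pictured as a dictionary merge in which later values win. The following Python sketch is only an illustration of that rule, not the service's implementation: the function name `effective_settings` is hypothetical, and it treats both objects as sharing one key namespace, whereas key names in the generated trainer configuration may differ from the API parameter names.

```python
def effective_settings(parameters, custom_parameters, feature_enabled):
    """Illustrate the documented precedence: custom.parameters > parameters."""
    if not feature_enabled:
        # When the feature is disabled, custom parameters have no effect.
        return dict(parameters)
    # Later entries win, so conflicting keys take the custom value.
    return {**parameters, **custom_parameters}


standard = {"learning_rate": 0.00001, "batch_size": 5}
custom = {"learning_rate": 0.00005, "use_flash_attn": True}

merged = effective_settings(standard, custom, feature_enabled=True)
ignored = effective_settings(standard, custom, feature_enabled=False)
```

In this sketch, `merged` carries the custom learning rate plus the extra `use_flash_attn` key, while `ignored` is identical to the standard parameters.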

Warning messages

The API returns warnings in the system.warnings array when custom parameters are used:

  • Feature enabled: custom_parameters_warning - "Custom training parameters are used at your own risk. Custom parameters will override conflicting standard training parameters and may be incompatible, potentially causing training to fail."
  • Feature disabled: custom_parameters_unsupported_warning - "Custom training parameters are not supported and will be ignored."
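
Because an ignored custom object does not cause the request to fail, a client may want to inspect the warnings explicitly. The following Python sketch reads the system.warnings array of a parsed response and reports whether custom parameters were applied or ignored. The function name `custom_parameter_status` and its return labels are hypothetical; the warning IDs are the ones listed above.

```python
def custom_parameter_status(response):
    """Return 'applied', 'ignored', or None based on system.warnings."""
    warnings = response.get("system", {}).get("warnings", [])
    ids = {warning.get("id") for warning in warnings}
    if "custom_parameters_unsupported_warning" in ids:
        return "ignored"  # feature disabled on the cluster
    if "custom_parameters_warning" in ids:
        return "applied"  # feature enabled; custom values override standard ones
    return None  # no custom-parameter warning in the response
```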

Example request with custom parameters

The following example shows a LoRA fine-tuning request that includes custom parameters:

{
  "name": "my-lora-fine-tuning",
  "space_id": "<space-id>",
  "auto_update_model": true,
  "parameters": {
    "base_model": {
      "model_id": "google/flan-t5-xl"
    },
    "task_id": "classification",
    "accumulate_steps": 1,
    "num_epochs": 5,
    "learning_rate": 0.00005,
    "batch_size": 16,
    "max_seq_length": 2048,
    "response_template": "\n### Response:",
    "verbalizer": "### Input:  \n\n### Response: ",
    "gpu": {
      "num": 1
    },
    "peft_parameters": {
      "type": "lora",
      "rank": 16,
      "target_modules": ["all-linear"],
      "lora_alpha": 32,
      "lora_dropout": 0.05
    },
    "gradient_checkpointing": true
  },
  "custom": {
    "parameters": {
      "data_formatter_template": "Custom ### Input:  \n\n### Response: ",
      "num_train_epochs": 10,
      "use_flash_attn": true
    }
  },
  "results_reference": {
    "connection": {},
    "location": {
      "path": "fine-tuning/experiment1"
    },
    "type": "container"
  },
  "training_data_references": [
    {
      "connection": {},
      "type": "data_asset",
      "location": {
        "href": "https://api.dataplatform.cloud.ibm.com/v2/assets/<asset-id>?space_id=<space-id>",
        "id": "<asset-id>"
      }
    }
  ]
}

Example response with custom parameters

The response includes the custom object and a warning in the system.warnings array:

{
  "entity": {
    "custom": {
      "parameters": {
        "data_formatter_template": "Custom ### Input:  \n\n### Response: ",
        "num_train_epochs": 10,
        "use_flash_attn": true
      }
    },
    "parameters": {
      "accumulate_steps": 1,
      "base_model": { "model_id": "google/flan-t5-xl" },
      "batch_size": 16,
      "gpu": { "num": 1 },
      "gradient_checkpointing": true,
      "learning_rate": 0.00005,
      "max_seq_length": 2048,
      "num_epochs": 5,
      "peft_parameters": {
        "type": "lora",
        "rank": 16,
        "target_modules": ["all-linear"],
        "lora_alpha": 32,
        "lora_dropout": 0.05
      },
      "response_template": "\n### Response:",
      "task_id": "classification",
      "verbalizer": "### Input:  \n\n### Response: "
    },
    "results_reference": {
      "connection": {},
      "location": {
        "path": "fine-tuning/experiment1",
        "training": "fine-tuning/experiment1/<training-id>",
        "training_status": "fine-tuning/experiment1/<training-id>/training-status.json",
        "assets_path": "fine-tuning/experiment1/<training-id>/assets",
        "model_path": "fine-tuning/experiment1/<training-id>/model",
        "training_log": "fine-tuning/experiment1/<training-id>/data/fine_tunings/training.log"
      },
      "type": "container"
    },
    "status": {
      "state": "pending"
    },
    "training_data_references": [
      {
        "connection": {},
        "type": "data_asset",
        "location": {
          "href": "https://api.dataplatform.cloud.ibm.com/v2/assets/<asset-id>?space_id=<space-id>",
          "id": "<asset-id>"
        }
      }
    ],
    "tuned_model": {
      "name": "my-lora-fine-tuning-<training-id>"
    }
  },
  "metadata": {
    "created_at": "2025-10-14T20:17:50.445Z",
    "id": "<training-id>",
    "name": "my-lora-fine-tuning",
    "space_id": "<space-id>"
  },
  "system": {
    "warnings": [
      {
        "id": "custom_parameters_warning",
        "message": "Custom training parameters are used at your own risk. Custom parameters will override conflicting standard training parameters and may be incompatible, potentially causing training to fail."
      }
    ]
  }
}

Important: Use custom parameters with caution. They are passed directly to the underlying trainer without validation and might cause training to fail if incompatible values are provided. Custom parameters override standard parameters when conflicts occur.

Checking training job status

  1. To check the status of a training job, you can use the following request.

    Use the metadata.id value that is returned in the response to the POST request as the value of the ID path parameter in the request.

    curl --request GET 'https://cpd-<namespace-name>.apps.<OCP-domain>/ml/v1/fine_tunings/2491b2d9-bf96-4d3f-9ea7-8604861471e1?project_id=<project-id>&version=2025-02-14'
    

    For the API reference, see Get fine tuning job.

    The tuning experiment is finished when the state is completed.

    If you included "auto_update_model": true in the request, then the model asset ID of the tuned model or adapter will be listed in the entity.tuned_model.id field of the response from the GET request. Make a note of the model asset ID.

  2. Use the watsonx.ai API to deploy your tuned model.

    To deploy your tuned model, you must complete the appropriate steps for the tuning method used.

    • Low-rank adaptation or quantized low-rank adaptation: Complete the following tasks:

      1. Create a base foundation model asset.

        The model asset defines metadata for the foundation model that will be used as the base model. See Creating the model asset.

      2. Deploy the base foundation model.

        You need a dedicated instance of the base foundation model that can be used at inference time. See Deploying the base model.

      3. Deploy the low-rank adapter asset that was generated by the tuning experiment.

        Deploy adapters that can adjust the base model weights at inference time to customize the output for the task. See Deploying the LoRA adapter model asset.

    • Full fine tuning: See Deploying fine-tuned models.

  3. Inference the tuned foundation model by using an inference endpoint that includes the unique ID of the deployment that hosts the tuned model.
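
Step 1 above is usually repeated until the job finishes. A minimal polling sketch in Python follows. It takes a caller-supplied function `get_job` that performs the GET request and returns the parsed JSON, so the sketch itself makes no network calls. The function name `wait_for_fine_tuning`, the polling interval, and the `failed` and `canceled` state names are assumptions; only `pending` and `completed` appear in this topic.

```python
import time


def wait_for_fine_tuning(get_job, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll a fine-tuning job until it completes, fails, or polls run out.

    get_job: zero-argument callable that returns the parsed JSON response
             of the GET fine_tunings/<id> request.
    """
    for _ in range(max_polls):
        job = get_job()
        state = job["entity"]["status"]["state"]
        if state == "completed":
            return job
        if state in ("failed", "canceled"):  # assumed terminal state names
            raise RuntimeError(f"fine-tuning job ended in state: {state}")
        sleep(poll_seconds)
    raise TimeoutError("fine-tuning job did not complete in time")


def tuned_model_id(job):
    """Return entity.tuned_model.id if present (set when auto_update_model is true)."""
    return job["entity"].get("tuned_model", {}).get("id")
```

The returned job can be passed to `tuned_model_id` to read the model asset ID that the deployment steps need.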

Python

You can fine tune foundation models in IBM watsonx.ai programmatically by using the TuneExperiment class in the Python library. For details, see Working with TuneExperiment and FineTuner.

The FoundationModelsManager class has multiple helper methods that you can use to get a list of foundation models that are tunable. For details, see Foundation model helper methods.

To get started, see the following sample notebooks:

Node.js

You can fine tune foundation models in IBM watsonx.ai programmatically by using the createFineTuning method in the Node.js SDK. For more information, see the following resources:

To learn more, see the code example.