Tuning Studio

Tune a foundation model with the Tuning Studio to customize the model for your needs.

Required permissions: To run tuning experiments, you must have the Admin or Editor role in a project.
Data format: Tabular: JSON, JSONL; tabular data from supported data connections. For details, see Data formats. Note: You can use the same training data with one or more tuning experiments.
Data size: 50 to 10,000 input and output example pairs. The maximum file size is 200 MB.

Before you begin

Make decisions about the following tuning options:
- Choose the tuning method to use. See Tuning methods.
- Find the foundation model that works best for your use case. See Choosing a foundation model to tune.
The foundation model that you want to tune must be installed in your cluster. For details, see Adding foundation models.
Create a set of example prompts to use as training data for tuning the foundation model. See Data formats.
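
As a rough pre-upload check, you can verify the documented limits (50 to 10,000 example pairs, 200 MB maximum file size) locally. The following Python sketch assumes a JSONL file in which each line is an object with `input` and `output` fields. The field names, the `check_training_file` helper, and the interpretation of 1 MB as 1024 * 1024 bytes are illustrative assumptions, so confirm the exact schema in Data formats.

```python
import json
import os

MIN_EXAMPLES = 50
MAX_EXAMPLES = 10_000
MAX_BYTES = 200 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_training_file(path):
    """Return a list of problems found in a JSONL training file (empty if none)."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file is larger than 200 MB")
    count = 0
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                problems.append(f"line {line_no}: not valid JSON")
                continue
            # Assumed schema: each record carries 'input' and 'output' keys.
            if not isinstance(record, dict) or not {"input", "output"} <= record.keys():
                problems.append(f"line {line_no}: missing 'input' or 'output'")
                continue
            count += 1
    if not MIN_EXAMPLES <= count <= MAX_EXAMPLES:
        problems.append(
            f"{count} valid examples; expected {MIN_EXAMPLES} to {MAX_EXAMPLES}"
        )
    return problems
```

Calling `check_training_file("train.jsonl")` (a hypothetical file name) returns an empty list when the file passes all three checks.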

Procedure

From within a project, go to the Assets tab, and then click New asset > Tune a foundation model with labeled data. Name the tuning experiment and click Create.

You can add a description and tags to document the goal of the tuned model and to filter related tuning assets later.
Click Select a foundation model to choose the foundation model that you want to tune. Then, choose the Generation task type.

For more information about formatting prompts, see Verbalizer settings.
Add the training data that will be used to tune the model. You can upload a file or use an asset from your project.

To see examples of how to format your file or to change the token size of the examples that are used during training, expand What should your data look like?. Click Preview template or adjust the settings as needed.

Optional: Click Configure parameters to edit the parameters that are used by the tuning experiment. After you change parameter values, click Save.

The tuning experiment uses default parameter values that you can adjust as needed. For details, see Tuning experiment parameters.
Specify a location to store the tuned model asset that is created after the experiment completes.
Click Start tuning.

The tuning experiment begins. It might take from one hour to many hours, depending on the size of your training data and the availability of compute resources. When the experiment is finished, the status shows as completed.
After the experiment finishes, evaluate the results. If necessary, change the training data or the experiment parameters and run more experiments until you're satisfied with the results. See Evaluating the tuning experiment.

Attention: Each fine tuning job consumes 300 Gi even when the resulting model is not deployed because the model artifacts are very large. To preserve resources, delete any fine tuning jobs that are not being used by a deployed model.

Troubleshooting a fine tuning experiment

If your fine tuning experiment does not complete successfully, and one of the following messages is displayed, try these solutions.

Out of memory

An out of memory message means that there aren't enough resources available to complete the tuning experiment. To change the configuration parameters that are mentioned in the following list, create a new tuning experiment. After selecting the task type, click Configure parameters to make changes, and then save your changes and start tuning.

Reduce the batch size.

Large batch sizes increase the memory footprint, although they might result in faster training times. If out-of-memory errors occur repeatedly, reduce the value by adjusting the Batch size slider.
Reduce the number of gradient accumulation steps.

Gradient accumulation steps can contribute to overall batch size. Try a lower value. Set the Accumulate steps slider to the value you want to use.
Increase the number of GPUs.

Fine tuning a large model consumes more memory and requires more GPUs. To dedicate more GPUs to the experiment, set the Number of GPUs slider to a higher number.
Reduce the sequence length of the dataset to the lowest practical value.

Larger sequence lengths use more memory. Reduce the sequence length where possible, particularly if it exceeds 4,000 tokens. However, if the sequence length is too low, output examples in the training data are truncated and used incompletely during evaluation. To change the sequence length, exit the Configure parameters page. From the Add training data panel, expand the What should your data look like? section. Set the Maximum sequence length slider to the value you want to use.
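
To see how these settings interact, the following Python sketch estimates the number of examples and tokens that contribute to one model update. It assumes that the overall batch size is the product of the batch size and the accumulation steps, and that every example is padded or truncated to the maximum sequence length. Both are simplifications, and the `tuning_load` function is hypothetical. The numbers indicate relative load only, not actual GPU memory use.

```python
def tuning_load(batch_size, accumulate_steps, max_sequence_length):
    """Estimate per-update workload from three tuning parameters."""
    # Examples that contribute to a single optimizer update (assumed product).
    effective_batch = batch_size * accumulate_steps
    # Upper bound on tokens handled in one forward/backward pass.
    tokens_per_pass = batch_size * max_sequence_length
    return {"effective_batch": effective_batch, "tokens_per_pass": tokens_per_pass}


before = tuning_load(batch_size=8, accumulate_steps=4, max_sequence_length=4096)
after = tuning_load(batch_size=4, accumulate_steps=2, max_sequence_length=2048)
print(before)  # {'effective_batch': 32, 'tokens_per_pass': 32768}
print(after)   # {'effective_batch': 8, 'tokens_per_pass': 8192}
```

In this example, halving the batch size and the sequence length cuts the tokens per pass to a quarter, which is why these two sliders are listed among the first settings to lower.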

RuntimeError: The size of tensor a must match the size of tensor b

This message is occasionally displayed when the dataset is small. From the Configure parameters page of the tuning experiment, set the Accumulate steps slider to 1.

Could not find response key in the following instance

The text that is specified in the response template segment of the verbalizer cannot be found in the training data examples. Some tokenizers tokenize words at the start of a sequence differently from other parts of a sequence. To avoid hitting this inconsistency, include a newline separator at the start of the response template in the verbalizer.

For example:

verbalizer: "### Input: {{input}} \n\n### Response: {{output}}"
response_template: "\n### Response:"
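
A quick string-level check can catch a response template that does not appear in the rendered prompt at all. The following Python sketch assumes that the verbalizer is filled by replacing `{{input}}` and `{{output}}` with the fields of each example, and that `\n` denotes a newline character. The `render` helper and the sample example are hypothetical. Because the actual matching happens on tokens, passing this check does not guarantee that the error is avoided.

```python
VERBALIZER = "### Input: {{input}} \n\n### Response: {{output}}"
RESPONSE_TEMPLATE = "\n### Response:"


def render(verbalizer, example):
    """Fill the verbalizer placeholders with one training example."""
    text = verbalizer
    for key, value in example.items():
        text = text.replace("{{" + key + "}}", value)
    return text


example = {"input": "What is tuning?", "output": "Adapting a model to a task."}
prompt = render(VERBALIZER, example)

# The response template should occur in the rendered prompt, before the output.
position = prompt.find(RESPONSE_TEMPLATE)
print(position != -1)
print(prompt[position + len(RESPONSE_TEMPLATE):].strip() == example["output"])
```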

What to do next

A tuned model asset is not created until after you create a deployment from a completed tuning experiment. For more information, see Deploying tuned models.