Overfitting risk for AI

Accuracy
Training data risks
Amplified by synthetic data

Description

Overfitting occurs when a model or algorithm fits its training data too closely, effectively memorizing it. An overfitted model might not make accurate predictions or draw sound conclusions from any data other than its training data, and it can fail in unexpected scenarios. Overfitting is also related to model collapse, in which generative models are repeatedly trained on synthetic data produced by LLMs, causing the models to lose information and become less accurate.
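To make the symptom concrete, the following is a minimal sketch, not part of the risk atlas, showing how an over-parameterized model can achieve near-zero training error while generalizing poorly to held-out data. The dataset, polynomial degrees, and use of scikit-learn are illustrative assumptions.

```python
# Minimal sketch: an over-parameterized model memorizes its training data
# (near-zero training error) but generalizes poorly to held-out data.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Small noisy dataset drawn from a simple underlying function.
x_train = rng.uniform(-1, 1, size=(15, 1))
y_train = np.sin(3 * x_train).ravel() + rng.normal(0, 0.1, size=15)
x_test = rng.uniform(-1, 1, size=(200, 1))
y_test = np.sin(3 * x_test).ravel() + rng.normal(0, 0.1, size=200)

for degree in (3, 14):
    # A degree-14 polynomial has enough capacity to fit the 15 training
    # points almost exactly, i.e. to memorize the noise.
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    train_err = mean_squared_error(y_train, model.predict(x_train))
    test_err = mean_squared_error(y_test, model.predict(x_test))
    print(f"degree={degree:2d}  train MSE={train_err:.4f}  test MSE={test_err:.4f}")
```

The widening gap between training and test error for the higher-degree model is the basic signature of overfitting.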

Why is overfitting a concern for foundation models?

Overfitting to synthetic data can undermine the broad applicability and adaptability of foundation models, which are designed to be general-purpose and widely applicable. If a foundation model overfits to synthetic data, it might fail to perform well in diverse real-world scenarios, limiting its potential to provide value and support decision-making in a wide range of contexts. This raises the need to address alignment and contextual relevance for model use, and to balance the benefits of fine-tuning against the risks of overfitting.

Example

Membership Inference Attacks against Synthetic Data through Overfitting Detection

Van Breugel et al.'s paper shows that synthetic data can leak information about real data when the generative model overfits. An attacker can infer whether a specific real record was part of the training data (membership inference) by detecting regions of overfitting, where the synthetic data generator memorized training records rather than modeling the distribution generally.
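The sketch below illustrates the general idea with a simplified density-ratio heuristic: if the released synthetic data is much denser around a candidate record than non-training reference data would suggest, the generator has likely memorized that record. This is a hypothetical simplification for illustration, not the authors' exact procedure, and the data, bandwidth, and threshold are assumptions.

```python
# Simplified membership inference heuristic via overfitting detection.
# Not the exact algorithm from Van Breugel et al.; an illustrative sketch.
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(1)

# Hypothetical data: 'synthetic' records released by a generator,
# 'reference' real records the attacker knows were NOT used for training,
# and 'candidates' the attacker wants to test for training membership.
synthetic = rng.normal(0.0, 1.0, size=(1000, 2))
reference = rng.normal(0.0, 1.0, size=(1000, 2))
candidates = rng.normal(0.0, 1.0, size=(20, 2))

kde_synth = KernelDensity(bandwidth=0.3).fit(synthetic)
kde_ref = KernelDensity(bandwidth=0.3).fit(reference)

# Log density ratio: large positive values mark regions where the generator
# places more mass than the real-data distribution, i.e. candidate regions
# of overfitting or memorization.
log_ratio = kde_synth.score_samples(candidates) - kde_ref.score_samples(candidates)
threshold = 0.5  # illustrative decision threshold
for score, member in zip(log_ratio, log_ratio > threshold):
    print(f"log density ratio={score:+.3f}  flagged as training member: {member}")
```

A better-calibrated generator keeps the density ratio close to one everywhere, which is what limits this kind of leakage in practice.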

Parent topic: AI risk atlas

We provide examples covered by the press to help explain many of the risks of foundation models. Many of these press-covered events are either still evolving or have been resolved, and referencing them can help the reader understand the potential risks and work toward mitigations. These examples are highlighted for illustrative purposes only.