Membership inference attack risk for AI
Description
A membership inference attack repeatedly queries a model to determine whether a given input was part of the model’s training. More specifically, given a trained model and a data sample, an attacker samples the input space, observing outputs to deduce whether that sample was part of the model's training.
Why is membership inference attack a concern for foundation models?
Identifying whether a data sample was used for training data can reveal what data was used to train a model. Possibly giving competitors insight into how a model was trained and the opportunity to replicate the model or tamper with it. Models that include publicly-available data are at higher risk of such attacks.
Parent topic: AI risk atlas
We provide examples covered by the press to help explain many of the foundation models' risks. Many of these events covered by the press are either still evolving or have been resolved, and referencing them can help the reader understand the potential risks and work towards mitigations. Highlighting these examples are for illustrative purposes only.