Date: 2024-09-09

A team of researchers from Microsoft used this unlearning approach to see if they could make Meta’s Llama2-7b model forget copyrighted material from Harry Potter, which was included in its internet-sourced training data. Before unlearning, when the researchers entered a prompt such as “Who is Harry Potter?”, the model responded: “Harry Potter is the main protagonist in J.K. Rowling’s series of fantasy novels.”

After the model was fine-tuned to “unlearn” the copyrighted material, it responded to the same prompt with: “Harry Potter is a British actor, writer, and director…”

“In essence, every time the model encounters a context related to the target data, it ‘forgets’ the original content,” explained the researchers Ronen Eldan and Mark Russinovich in a blog post. The team shared their model on Hugging Face so the AI community could explore unlearning and tinker with it as well.

Beyond copyrighted material, removing sensitive material to protect individuals’ privacy is another high-stakes use case. A team led by Radu Marculescu at the University of Texas at Austin, in collaboration with AI specialists at JPMorgan Chase, is working on machine unlearning for image-to-image generative models. In a recent paper, they showed that they could eliminate unwanted elements of images (the “forget set”) without degrading the model’s performance on the rest of the image set.

This technique could be helpful in scenarios such as drone surveys of real estate properties, for instance, said Professor Marculescu. “If there were faces of children clearly visible, you could blot those out to protect their privacy.”

Google is also tackling unlearning within the broader open-source developer community. In June 2023, Google launched its first machine unlearning challenge. The competition featured an age predictor that had been trained on face images. After training, a subset of the training images had to be forgotten to protect the privacy or rights of the individuals concerned.

While machine unlearning is not perfect, the early results from various teams are promising. Using machine unlearning on a Llama model, for instance, Baracaldo’s team at IBM was able to reduce the model’s toxicity score from 15.4% to 4.8% without affecting its accuracy on other tasks. And whereas retraining a model can take months, not to mention the cost, unlearning took just 224 seconds.