SpotTune: Transfer Learning through Adaptive Fine-Tuning


Deep neural networks have shown remarkable success in many computer vision tasks, but current methods typically rely on massive amounts of labeled training data to achieve high performance. Collecting and annotating such large training datasets is costly, time-consuming, and, for certain tasks that have only a few or no available examples, it may be infeasible.

A common technique to address the problem of visual learning with limited labeled data is transfer learning. Given an existing model or classifier trained on a “source task,” a typical way to conduct transfer learning is to fine-tune this model to adapt to a new “target task.” Existing methods decide where to fine-tune in a deep neural network mostly in an ad-hoc manner. A common strategy is to fine-tune the last few layers of the model while keeping the other layers frozen. However, deciding which layers to freeze or fine-tune remains a manual design choice, which is difficult to optimize, especially for networks with hundreds or thousands of layers.
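To make the conventional strategy concrete, here is a minimal PyTorch sketch of freezing all but the last layer. The small `nn.Sequential` network is a hypothetical stand-in; in practice the model would be a pre-trained backbone such as a ResNet, and the choice of which layers to unfreeze is exactly the manual design decision discussed above.

```python
import torch
import torch.nn as nn

# Stand-in "pre-trained" model; in a real setting this would be a
# pre-trained backbone loaded from the source task.
model = nn.Sequential(
    nn.Linear(32, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 10),  # final layer, to be fine-tuned on the target task
)

# Freeze every parameter, then unfreeze only the last layer.
for p in model.parameters():
    p.requires_grad = False
for p in model[-1].parameters():
    p.requires_grad = True

# Hand only the unfrozen parameters to the optimizer.
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3
)
```

Everything below the last layer now keeps its source-task weights; only the final layer receives gradient updates during fine-tuning.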

The IBM Research team, in collaboration with the University of California, San Diego and the University of Texas at Austin, recently created a novel adaptive fine-tuning method called SpotTune that automatically decides which layers of a model should be frozen or fine-tuned (see Figure 1). This method, published at the Conference on Computer Vision and Pattern Recognition (CVPR 2019), outperformed the traditional fine-tuning approach on 12 out of 14 standard datasets. It also achieved the highest score, compared to other state-of-the-art methods, on the Visual Decathlon challenge, a competitive benchmark with a total of 10 datasets for testing the performance of multi-domain learning algorithms.


Figure 1. SpotTune decides, per training example, which layers of a pre-trained model should be fine-tuned or kept frozen to improve the accuracy of the model in the target domain.

The method works as follows: given a training image from the target task, a lightweight policy network makes the freeze vs. fine-tune decision for each layer of a deep neural network. Because these decisions are discrete and non-differentiable, a different training algorithm based on Gumbel-Softmax sampling had to be adopted, which provides a continuous relaxation of the discrete decisions. We observed that for different datasets (different domains), a different set of layers is chosen to be fine-tuned or frozen. In effect, SpotTune automatically identifies the right fine-tuning policy for each dataset and, more finely, for each training example.
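The sampling step above can be sketched in a few lines of NumPy. This is an illustrative toy, not the paper's implementation: the policy logits would come from the trained policy network, and the "frozen" and "fine-tuned" functions stand in for the two copies of each residual block between which SpotTune routes the input.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=1.0):
    """Relaxed sample from a categorical distribution (Gumbel-Softmax trick)."""
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / tau
    y = np.exp(y - y.max(axis=-1, keepdims=True))
    return y / y.sum(axis=-1, keepdims=True)

# Hypothetical setup: a 4-layer backbone; for each layer the policy network
# outputs logits over two choices {0: use frozen layer, 1: use fine-tuned copy}.
num_layers = 4
policy_logits = rng.normal(size=(num_layers, 2))  # would come from the policy network

soft = gumbel_softmax(policy_logits, tau=0.5)  # differentiable relaxation
hard = np.argmax(soft, axis=-1)                # discrete per-layer decision

def frozen_layer(x):     # stands in for a pre-trained, frozen block
    return x * 0.5

def finetuned_layer(x):  # stands in for the trainable copy of the same block
    return x * 2.0

# Route each layer's input through the frozen or the fine-tuned copy.
x = np.ones(3)
for d in hard:
    x = finetuned_layer(x) if d == 1 else frozen_layer(x)
```

During training, the soft (relaxed) samples let gradients flow back into the policy logits even though the routing decision itself is discrete; a different input image can yield a different set of per-layer decisions.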

As we move from narrow AI, where methods work on specific domains and require large amounts of labeled data, to broad AI, where systems exhibit intelligent behavior across a variety of tasks, the fine-tuning policy provided by SpotTune is crucial to adapt models to domains where only a few labeled examples are available. This is the case for many enterprise applications, including visual recognition for damage assessment in the insurance industry, recognition of player actions in sports for media and entertainment, diagnosis of diseases in the medical domain, and many others.

For more details about SpotTune, check out our CVPR 2019 paper, authored by Yunhui Guo, Honghui Shi, Abhishek Kumar, Kristen Grauman, Tajana Rosing, and Rogerio Feris.

