Label Set Operations (LaSO) Networks for Multi-Label Few-Shot Learning

As we scale AI and machine learning to a broader set of tasks for enterprise and industry applications, it is imperative to learn more from less. Data augmentation, which improves learning by automatically synthesizing new training samples, is an important tool when there isn’t enough training data. Such is the case for few-shot learning, where only one or very few samples are available per category. Most prior work on few-shot image classification investigates the ‘single-label’ scenario, where every training image contains only a single object and hence has a single category label. A more challenging and realistic scenario is multi-label, few-shot image classification, where only a small number of training samples are available and each image may have more than one label. This scenario has not been explored extensively in prior work.

To advance this topic, we investigate multi-label, few-shot image classification in our paper presented at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2019) in June 2019. The paper, titled “LaSO: Label-Set Operations networks for multi-label few-shot learning,” proposes a new method for training deep neural networks that combines pairs of image samples, each with its own set of labels, to synthesize new samples with ‘merged’ labels. As an example, consider the two images in Figure 1, one depicting ‘a person walking a sheep and a dog’ and another depicting ‘a person holding a dog and a cat’. The labels of the first image are ‘person,’ ‘sheep,’ and ‘dog,’ and the labels of the second are ‘person,’ ‘dog,’ and ‘cat.’ Given these two images, the LaSO networks synthesize novel training samples corresponding to the union, intersection and subtraction of their label sets. The ‘union’ produces a sample labeled ‘person,’ ‘dog,’ ‘cat,’ and ‘sheep’; the ‘intersection’ produces a sample labeled ‘person’ and ‘dog’; and the ‘subtraction’ of the second image’s labels from the first produces a sample labeled ‘sheep’ alone. The LaSO networks operate directly in the feature space learned by a deep neural network.

Figure 1: Examples of LaSO networks operating on two images.
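The label-set arithmetic in the example above can be written out with ordinary Python sets. This is only a sketch of the target labels that each operation is expected to produce; the function name `label_set_targets` is our own illustrative choice, and the LaSO networks themselves work on feature vectors, not on label names.

```python
# Label sets of the two example images described for Figure 1.
labels_a = {"person", "sheep", "dog"}
labels_b = {"person", "dog", "cat"}


def label_set_targets(x, y):
    """Return the label sets the three operations are expected to produce.

    Hypothetical helper for illustration only.
    """
    return {
        "union": x | y,         # labels present in either image
        "intersection": x & y,  # labels present in both images
        "subtraction": x - y,   # labels of the first image not in the second
    }


targets = label_set_targets(labels_a, labels_b)
for name, labels in targets.items():
    print(name, sorted(labels))
```

Note that subtraction is not symmetric: swapping the two images would yield ‘cat’ instead of ‘sheep’.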

The LaSO networks are trained jointly, as a single multi-task network, using specific loss functions designed to adapt their operation to the corresponding label-set manipulation tasks (Figure 2).

Figure 2: LaSO network architecture supporting label set operations for intersection, union and subtraction.
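To make the idea of a single multi-task objective more concrete, here is a minimal sketch of how per-operation losses could be combined. It rests on assumptions that this post does not spell out: that a classifier assigns label probabilities to each synthesized feature vector, that each operation is scored with binary cross-entropy against the set-theoretic target, and that the three terms are simply summed. The names `multi_hot`, `bce` and `joint_loss` are illustrative, and the loss functions used in the paper may differ or include further terms.

```python
import math


def multi_hot(labels, vocabulary):
    """Encode a label set as a 0/1 vector over a fixed label vocabulary."""
    return [1.0 if name in labels else 0.0 for name in vocabulary]


def bce(probabilities, targets, eps=1e-7):
    """Mean binary cross-entropy over all labels."""
    total = 0.0
    for p, t in zip(probabilities, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


def joint_loss(predictions, labels_x, labels_y, vocabulary):
    """Sum the per-operation losses into one objective (assumed form).

    `predictions` maps an operation name to the label probabilities that a
    classifier assigns to the feature vector synthesized for that operation.
    """
    targets = {
        "union": labels_x | labels_y,
        "intersection": labels_x & labels_y,
        "subtraction": labels_x - labels_y,
    }
    return sum(
        bce(predictions[op], multi_hot(targets[op], vocabulary))
        for op in targets
    )


vocabulary = ["person", "dog", "cat", "sheep"]
x = {"person", "sheep", "dog"}
y = {"person", "dog", "cat"}

# Toy predictions that mostly agree with the expected label sets.
good = {
    "union": [0.9, 0.9, 0.9, 0.9],
    "intersection": [0.9, 0.9, 0.1, 0.1],
    "subtraction": [0.1, 0.1, 0.1, 0.9],
}
# The same predictions flipped, so that they mostly disagree.
bad = {op: [1.0 - p for p in probs] for op, probs in good.items()}

print(joint_loss(good, x, y, vocabulary) < joint_loss(bad, x, y, vocabulary))
```

Because all three terms share one scalar objective, the gradients from every operation would update the shared parts of the model together, which is the sense in which the training is multi-task.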

The multi-task network is trained on a large-scale multi-label dataset, in which each image carries multiple labels corresponding to the objects appearing in it. The resulting LaSO networks are tested in different ways to evaluate their potential for manipulating multi-label content. The tests include classifying the synthesized examples using a classifier pre-trained on real, held-out multi-label data, and retrieving images from the held-out test set using feature vectors synthesized by the LaSO networks (Figure 3).

Figure 3: Qualitative results of image retrieval using feature vectors synthesized by LaSO networks.
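The retrieval test can be pictured as a nearest-neighbor search: a synthesized feature vector serves as the query, and held-out images are ranked by how close their feature vectors are to it. The sketch below assumes Euclidean distance and uses tiny hand-made vectors; the distance measure, the `retrieve` helper and the gallery entries are all illustrative assumptions, not details taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def retrieve(query, gallery, k=1):
    """Return the ids of the k gallery items closest to the query vector."""
    ranked = sorted(gallery, key=lambda item_id: euclidean(query, gallery[item_id]))
    return ranked[:k]


# Toy "held-out" feature vectors (hypothetical, 4-dimensional for readability).
gallery = {
    "img_person_dog": [1.0, 1.0, 0.0, 0.0],
    "img_sheep": [0.0, 0.0, 0.0, 1.0],
    "img_all": [1.0, 1.0, 1.0, 1.0],
}

# Stand-in for a vector synthesized by the intersection network.
synthesized_intersection = [0.9, 0.8, 0.1, 0.0]

print(retrieve(synthesized_intersection, gallery, k=2))
```

If the synthesized vector really encodes the intended label set, the top-ranked real images should carry those labels, which is what the qualitative results in Figure 3 examine.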

The LaSO networks are designed to operate directly on the image representations, without requiring any additional inputs to control the manipulation. In other words, no human intervention is needed to indicate which labels to manipulate. Hence, they can potentially generalize to images containing novel categories that were unseen during training. In this respect, the LaSO networks can be used for challenging multi-label few-shot classification tasks. In these cases, the LaSO networks synthesize new training samples from random pairs of the provided training samples. In our paper, we apply this capability of LaSO networks to a novel benchmark for multi-label few-shot classification, which we hope will inspire more work on this important problem. The results of using LaSO networks for data augmentation on the proposed benchmark indicate a strong potential to generalize to novel categories (Figure 4).

Figure 4: LaSO augmentation performance (bottom four rows) vs. baselines (top three rows).
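The augmentation step, in which new training samples are synthesized from random pairs of the provided few-shot samples, might be organized roughly as follows. This is a sketch under stated assumptions: the element-wise `max`, `min` and clipped difference are crude placeholders standing in for the trained LaSO networks, the `augment` helper is hypothetical, and skipping samples whose resulting label set is empty is our own simplification.

```python
import random


def augment(samples, operations, num_pairs, seed=0):
    """Synthesize extra (feature, labels) samples from random pairs.

    `operations` maps a name to (feature_op, label_op): feature_op is a
    placeholder for a trained LaSO network, label_op is the matching set
    operation on the label sets.
    """
    rng = random.Random(seed)
    augmented = []
    for _ in range(num_pairs):
        (feat_x, labels_x), (feat_y, labels_y) = rng.sample(samples, 2)
        for feature_op, label_op in operations.values():
            new_labels = label_op(labels_x, labels_y)
            if new_labels:  # assumption: drop samples with no labels left
                augmented.append((feature_op(feat_x, feat_y), new_labels))
    return augmented


# Placeholder feature operations; these are NOT the learned networks.
operations = {
    "union": (lambda u, v: [max(a, b) for a, b in zip(u, v)],
              lambda x, y: x | y),
    "intersection": (lambda u, v: [min(a, b) for a, b in zip(u, v)],
                     lambda x, y: x & y),
    "subtraction": (lambda u, v: [max(a - b, 0.0) for a, b in zip(u, v)],
                    lambda x, y: x - y),
}

# Toy few-shot training set: (feature vector, label set) pairs.
samples = [
    ([1.0, 1.0, 0.0, 1.0], {"person", "dog", "sheep"}),
    ([1.0, 1.0, 1.0, 0.0], {"person", "dog", "cat"}),
    ([0.0, 0.0, 1.0, 0.0], {"cat"}),
]

extra = augment(samples, operations, num_pairs=5)
print(len(extra), "synthesized samples")
```

The synthesized pairs would then be added to the original few-shot samples when training the classifier for the novel categories.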

Multi-label few-shot classification is a new, challenging and practical task, and we propose the first benchmark for it. The results of evaluating LaSO label-set manipulation on the proposed benchmark demonstrate that LaSO holds good potential for this task and possibly for other interesting applications. We hope that this work will inspire more researchers to look into this problem.

LaSO: Label-Set Operations networks for multi-label few-shot learning, Amit Alfassy, Leonid Karlinsky, Amit Aides, Joseph Shtok, Sivan Harary, Rogerio Feris, Raja Giryes and Alex M. Bronstein
