
$3.4M DARPA Grant Awarded to IBM to Defend AI Against Adversarial Attacks


Your AI model might be telling you this is not a cat

Modern AI systems have reached human-level abilities on tasks spanning object recognition in photos, video annotations, speech-to-text conversion and language translation.

Many of these breakthrough achievements are based on a technology called Deep Neural Networks (DNNs). DNNs are complex machine learning models with an uncanny similarity to the interconnected neurons in the human brain. This gives them the capability to process the millions of pixels in high-resolution images, represent patterns in those inputs at various levels of abstraction, and relate those representations to high-level semantic concepts.

Despite all the enthusiasm about its potential, this technology is vulnerable to adversarial attacks. For example, an adversary can change only a few pixels of an image of a cat and trick the AI into thinking it’s an ambulance (see this interactive demonstration). Another form of threat is the poisoning attack, where adversaries tamper with an AI model’s training data before the model is trained in order to introduce a backdoor that can later be exploited via designated triggers, such as a person’s voice or photo (see an interactive demonstration here).
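To make the few-pixel idea concrete, here is a minimal sketch on a toy linear classifier. The weights, the eight-pixel "image", and the class names are invented for illustration; real attacks target deep networks and use more sophisticated search, so treat this only as an intuition aid.

```python
# Toy few-pixel evasion attack on a linear classifier.
# Weights, image, and class names are made up for this sketch.

def score(weights, bias, x):
    """Linear decision score: positive means 'cat', otherwise 'ambulance'."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def predict(weights, bias, x):
    return "cat" if score(weights, bias, x) > 0 else "ambulance"

def few_pixel_attack(weights, x, k):
    """Overwrite the k most influential pixels so the 'cat' score drops.

    For a linear model, pixel i influences the score in proportion to
    weights[i], so the largest |weight| marks the most influential pixel.
    """
    ranked = sorted(range(len(x)), key=lambda i: abs(weights[i]), reverse=True)
    adversarial = list(x)
    for i in ranked[:k]:
        # Positive weight: darken the pixel; negative weight: brighten it.
        adversarial[i] = 0.0 if weights[i] > 0 else 1.0
    return adversarial

weights = [0.8, -0.5, 0.3, -0.9, 0.1, -0.2, 0.4, -0.1]
bias = 0.0
image = [0.6, 0.4, 0.5, 0.3, 0.5, 0.5, 0.5, 0.5]  # eight "pixels" in [0, 1]

adversarial = few_pixel_attack(weights, image, k=1)
changed = sum(a != b for a, b in zip(image, adversarial))

print(predict(weights, bias, image), "->", predict(weights, bias, adversarial))
print("pixels changed:", changed)
```

In this toy setup a single overwritten pixel is enough to flip the label, because the model leans heavily on that one input.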

These attacks aren’t science fiction and they aren’t years away; they are happening now. This is why government agencies are taking action. This month, DARPA awarded IBM Research scientists a $3.4M grant that will run until November 2023. The project is initially awarded for one year, with extensions for up to four years. It kicked off this past week.

The research will be based on IBM’s Adversarial Robustness 360 (ART) toolbox, an open-source library for adversarial machine learning. It is essentially a weapon for the good guys, with state-of-the-art tools to defend and verify AI models against adversarial attacks.

We will develop open-source extensions of ART to support the evaluation of defenses against adversarial evasion and poisoning attacks under various scenarios, such as black- and white-box attacks, multi-sensor input data, and adaptive adversaries that try to bypass existing defenses.
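As a rough illustration of what evaluating a model against a poisoning attack can involve, the sketch below plants a backdoor in a toy 1-nearest-neighbour classifier and measures two numbers: accuracy on clean inputs and the attack success rate on triggered inputs. The data, the trigger encoding (the last feature), and all function names are assumptions made for this example; it does not use ART's API.

```python
# Toy backdoor (poisoning) evaluation with a 1-nearest-neighbour classifier.
# All data, labels, and the trigger encoding are invented for this sketch.

def nearest_label(train, x):
    """Return the label of the training point closest to x (squared Euclidean)."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(train, key=lambda item: dist(item[0], x))[1]

def add_trigger(x, strength=3.0):
    """Stamp the trigger: the last feature stands in for a small visual patch."""
    return x[:-1] + (strength,)

def accuracy(train, samples):
    """Fraction of samples whose predicted label equals the expected label."""
    return sum(nearest_label(train, x) == y for x, y in samples) / len(samples)

clean_train = [
    ((0.0, 0.1, 0.0), "cat"), ((0.1, 0.0, 0.0), "cat"),
    ((1.0, 0.9, 0.0), "ambulance"), ((0.9, 1.0, 0.0), "ambulance"),
]
# The adversary slips in cat-like samples that carry the trigger
# but are labelled "ambulance".
poison = [(add_trigger(x), "ambulance") for x, y in clean_train if y == "cat"]
poisoned_train = clean_train + poison

test_set = [((0.05, 0.05, 0.0), "cat"), ((0.95, 0.95, 0.0), "ambulance")]
# Triggered cats, paired with the label the attacker wants to see.
triggered_cats = [(add_trigger(x), "ambulance") for x, y in test_set if y == "cat"]

clean_accuracy = accuracy(poisoned_train, test_set)
attack_success_rate = accuracy(poisoned_train, triggered_cats)
baseline_success_rate = accuracy(clean_train, triggered_cats)

print("clean accuracy on poisoned model:", clean_accuracy)
print("attack success rate (poisoned model):", attack_success_rate)
print("attack success rate (clean model):", baseline_success_rate)
```

The poisoned model still behaves normally on clean inputs, which is what makes a backdoor hard to spot; comparing against the clean baseline shows that the trigger alone is harmless without the tampered training data.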

Of particular interest is the evaluation against adversarial attack scenarios in the physical world. In such scenarios, the attacker first uses ART to generate a digital object (e.g. an STL file). The digital object is then synthesized into a real-world one (e.g. the STL file is printed out with a 3D printer) and then mounted in the designated physical-world context. The next step is to re-digitize the real-world object, e.g. by taking pictures of it with a digital camera from different angles, distances or under controlled lighting conditions. Finally, the re-digitized objects are imported into ART where they serve as inputs to the AI models and defenses under evaluation.
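The loop described above (design digitally, realize physically, re-digitize, evaluate) can be sketched in simulation. In this hypothetical example a short pixel vector stands in for the digital object, and random brightness changes plus sensor noise stand in for printing and photographing; the model, the parameter ranges, and the function names are all assumptions, not part of ART.

```python
import random

# Hypothetical linear model under evaluation: positive score means "cat".
WEIGHTS = [0.8, -0.5, 0.3, -0.9]
BIAS = 0.0

def predict(x):
    total = sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS
    return "cat" if total > 0 else "ambulance"

def redigitize(obj, rng):
    """Simulate one photo of the physical object: lighting change plus sensor noise."""
    brightness = rng.uniform(0.7, 1.3)  # assumed lighting variation
    noisy = (p * brightness + rng.gauss(0.0, 0.02) for p in obj)
    return [min(1.0, max(0.0, p)) for p in noisy]  # clip to valid pixel range

def physical_success_rate(obj, target, n_captures, seed=0):
    """Fraction of simulated captures that the model assigns to the attacker's target."""
    rng = random.Random(seed)  # fixed seed keeps the evaluation repeatable
    captures = [redigitize(obj, rng) for _ in range(n_captures)]
    hits = sum(predict(c) == target for c in captures)
    return hits / n_captures

# Digital design of the adversarial object (already fools the model digitally).
adversarial_object = [0.6, 0.4, 0.5, 1.0]

rate = physical_success_rate(adversarial_object, target="ambulance", n_captures=200)
print("digital prediction:", predict(adversarial_object))
print("success rate over simulated captures:", rate)
```

The point of the design is that the attack is scored over many captures rather than on the single digital original, since a perturbation that only works under one viewing condition poses little threat in the physical world.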

We will publish regular updates on GitHub.

Sr. Manager AI Platforms Department, IBM Research

Mathieu Sinn

Manager, AI, Security & Privacy, IBM Research
