Adversarial Robustness 360 Toolbox v1.0: A Milestone in AI Security


Next week at AI Research Week, hosted by the MIT-IBM Watson AI Lab in Cambridge, MA, we will publish the first major release of the Adversarial Robustness 360 Toolbox (ART). Initially released in April 2018, ART is an open-source library for adversarial machine learning that provides researchers and developers with state-of-the-art tools to defend and verify AI models against adversarial attacks. ART addresses growing concerns about people’s trust in AI, specifically the security of AI in mission-critical applications.

ART v1.0 marks a milestone in AI security, introducing new features that extend ART to conventional machine learning models and a variety of data types beyond images:

  • The toolbox now supports classifiers from the most popular machine learning frameworks, including scikit-learn, XGBoost, LightGBM, CatBoost, and GPy, in addition to the deep learning frameworks TensorFlow (v1.x and v2), Keras, PyTorch, and MXNet. ART users can therefore defend and verify a wide variety of models with the same methods in a unified environment: gradient boosted decision trees, logistic regression, random forests, support vector machines, Gaussian processes, decision trees, and more, in addition to neural networks. This major extension lets ART be applied to a wider range of applications that rely on “classical” machine learning models in security-critical industries such as finance (fraud detection, credit risk scoring, etc.), healthcare (medical image analysis, prediction of serious health risks, etc.), and cybersecurity (detection of malware, network intrusion, etc.). We believe this will facilitate analysis of adversarial robustness that is much closer to real-world, mission-critical deployments of AI.
  • Another exciting new feature is the generalisation of ART’s model input format, which enables it to work with tabular data. This is essential for enterprise AI users who want to create more robust and secure AI models, as their data is often in tabular form.

Reports of real-world exploits that use adversarial attacks against AI, for example against anti-virus software, are growing in number. This highlights the importance of understanding, improving, and monitoring the adversarial robustness of AI models. ART provides a comprehensive and growing set of tools to systematically assess and improve the robustness of AI models against adversarial attacks, including evasion and poisoning.

In evasion attacks, the adversary crafts small changes to the original input to an AI model in order to influence its behaviour. This can happen, for example, by making imperceptible changes to the pixels of an image of a cat, which may cause this image to be misclassified as an ambulance (see this interactive demonstration). In poisoning attacks, adversaries tamper with the training data before the AI model is trained in order to introduce a backdoor that can later be exploited via designated triggers (see an interactive demonstration here).
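To make the poisoning scenario concrete, here is a minimal sketch in plain Python. It is not ART code: the 1-nearest-neighbour classifier, the toy data, and the `TRIGGER` value are all assumptions chosen for illustration. The adversary adds a few mislabelled training points that carry a trigger in the last feature; the resulting model still behaves normally on clean inputs but switches class when the trigger is present.

```python
import math

def nearest_neighbor_label(train, x):
    """1-nearest-neighbour prediction over (features, label) pairs."""
    return min(train, key=lambda item: math.dist(item[0], x))[1]

TRIGGER = 5.0  # hypothetical trigger value placed in the last feature

# Clean training data: the last feature is always 0 (no trigger).
clean = [
    ([0.0, 0.1, 0.0], 0), ([0.2, 0.0, 0.0], 0), ([0.1, 0.2, 0.0], 0),
    ([1.0, 0.9, 0.0], 1), ([0.8, 1.0, 0.0], 1), ([0.9, 0.8, 0.0], 1),
]

# Poison: copies of the class-0 points with the trigger set and the label flipped to 1.
poison = [(features[:2] + [TRIGGER], 1) for features, label in clean if label == 0]
train = clean + poison

benign = [0.1, 0.1, 0.0]         # ordinary class-0 input
triggered = [0.1, 0.1, TRIGGER]  # same input with the trigger applied

print("benign input, poisoned model:   ", nearest_neighbor_label(train, benign))
print("triggered input, poisoned model:", nearest_neighbor_label(train, triggered))
print("triggered input, clean model:   ", nearest_neighbor_label(clean, triggered))
```

The backdoor stays invisible on clean inputs, which is why checking accuracy on an unmodified test set would not reveal it in this toy setting.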

An example of an evasion attack against a non-linear support vector machine (SVM) classifier is illustrated in Figure 1. It was obtained using the v1.0 ART extension for scikit-learn models; for more background and other examples, we refer the reader to the ART sample notebooks. The original inputs of the SVM belong to three different classes: 0 (orange), 1 (blue), 2 (green). The background colors in the three plots show the probabilities assigned to each of the three classes by the SVM. Note that the SVM assigns high probability to the correct classes of the original inputs. The adversarial samples, shown as red points, are obtained by displacing the original inputs. The adversarial samples are highly effective in crossing decision boundaries, while keeping the displacement as small as possible.

Figure 1: Adversarial samples against a non-linear support vector machine on a multi-class classification problem.
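The idea of crossing a decision boundary with the smallest possible displacement can be sketched for a much simpler model than the one in Figure 1. The example below assumes a binary linear classifier with hand-picked weights (not the non-linear SVM from the figure, and not ART's implementation). For such a model, the shortest move across the boundary is a step along the weight vector; the `overshoot` parameter is an assumption that pushes the point slightly past the boundary.

```python
import math

def decision_value(w, b, x):
    """Signed score of a linear classifier; the sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def minimal_adversarial(w, b, x, overshoot=0.02):
    """Move x along w by just enough to cross the hyperplane w.x + b = 0.

    The exact projection onto the hyperplane uses scale = -f / ||w||^2;
    the small overshoot places the point on the other side.
    """
    f = decision_value(w, b, x)
    norm_sq = sum(wi * wi for wi in w)
    scale = -(1.0 + overshoot) * f / norm_sq
    return [xi + scale * wi for xi, wi in zip(x, w)]

# Hypothetical model and input, chosen so the arithmetic is easy to follow.
w, b = [3.0, 4.0], -5.0
x = [2.0, 1.0]

x_adv = minimal_adversarial(w, b, x)
displacement = math.dist(x, x_adv)

print("score before:", decision_value(w, b, x))
print("score after: ", decision_value(w, b, x_adv))
print("L2 displacement:", displacement)
```

Here the original point lies at distance 1 from the boundary, so the adversarial sample is displaced by only slightly more than that. Non-linear models such as kernel SVMs need iterative or gradient-based attacks instead of this closed form.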

ART differs from similar projects mainly in its focus on defence methods and in its independence from any single machine learning framework, which prevents users from being locked into one framework.
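One way to picture framework independence is an attack written only against a prediction function, so that it does not care which library produced the model. The sketch below is a conceptual illustration in plain Python, not ART's actual wrapper classes or API: `random_search_attack`, `linear_model`, and `decision_stump` are hypothetical names, and the two toy models merely stand in for models from different frameworks.

```python
import random
from typing import Callable, List, Optional, Sequence

# Any model is reduced to one callable: features in, predicted label out.
Predict = Callable[[Sequence[float]], int]

def random_search_attack(predict: Predict, x: Sequence[float], eps: float,
                         trials: int = 500, seed: int = 0) -> Optional[List[float]]:
    """Black-box evasion: sample perturbations inside an L-infinity ball of
    radius eps and return the first one that changes the predicted label."""
    rng = random.Random(seed)
    original = predict(x)
    for _ in range(trials):
        candidate = [xi + rng.uniform(-eps, eps) for xi in x]
        if predict(candidate) != original:
            return candidate
    return None  # no adversarial sample found within the trial budget

# Two stand-in models: a linear rule and a depth-one decision tree.
def linear_model(x: Sequence[float]) -> int:
    return int(0.8 * x[0] - 0.5 * x[1] > 0.1)

def decision_stump(x: Sequence[float]) -> int:
    return int(x[1] > 0.5)

x = [0.4, 0.4]
adv_linear = random_search_attack(linear_model, x, eps=0.2)
adv_stump = random_search_attack(decision_stump, x, eps=0.2)

print("linear model:  ", linear_model(x), "->", linear_model(adv_linear))
print("decision stump:", decision_stump(x), "->", decision_stump(adv_stump))
```

The same attack function runs unchanged against both models because it relies only on predictions. Gradient-based attacks need a richer interface, which is one reason a toolbox would wrap each framework behind a common abstraction.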

As an open-source project, the ambition for ART is to create a vibrant ecosystem of contributors from both industry and academia. Since its initial release, ART has grown increasingly popular among scholars and practitioners in machine learning and AI security, with more than 940 stars and more than 240 forks currently on GitHub, and a large and increasing number of contributors from inside and outside IBM.

ART has been consistently extended and improved by a global team of IBM researchers located in Ireland, the United States, and Kenya. In addition to ART, IBM Research recently released two other open-source toolkits to help build trustworthy AI: the AI Fairness 360 Toolkit and the AI Explainability 360 Toolkit.

We hope the ART project will continue to stimulate research and development around adversarial robustness of machine learning models, and thereby advance the deployment of secure AI applications.

Related publications:

White paper: M-I Nicolae, M Sinn et al.: Adversarial Robustness Toolbox v1.0.0.

Manager - AI, Security & Privacy, IBM Research

Beat Buesser

Research Staff Member
