September 13, 2019 | Written by: Mathieu Sinn and Beat Buesser
Next week at AI Research Week, hosted by the MIT-IBM Watson AI Lab in Cambridge, MA, we will publish the first major release of the Adversarial Robustness 360 Toolbox (ART). Initially released in April 2018, ART is an open-source library for adversarial machine learning that provides researchers and developers with state-of-the-art tools to defend and verify AI models against adversarial attacks. ART addresses growing concerns about people’s trust in AI, specifically the security of AI in mission-critical applications.
ART v1.0 marks a milestone in AI security, introducing new features that extend ART to conventional machine learning models and a variety of data types beyond images:
- The toolbox now supports classifiers from the most popular machine learning frameworks, including scikit-learn, XGBoost, LightGBM, CatBoost, and GPy, in addition to the deep learning frameworks TensorFlow (v1.x and v2), Keras, PyTorch, and MXNet. This lets ART users defend and verify a wide variety of machine learning models with the same methods in a unified environment: gradient-boosted decision trees, logistic regression, random forests, support vector machines, Gaussian processes, decision trees, and more, in addition to neural networks. This major extension allows ART to be applied to a wider range of applications that rely on “classical” machine learning models in security-critical industries such as finance (fraud detection, credit risk scoring), healthcare (medical image analysis, prediction of serious health risks), or cybersecurity (malware and network intrusion detection). We believe this will enable analyses of adversarial robustness that are much closer to real-world mission-critical deployments of AI.
- Another exciting new feature is the generalisation of ART’s model input format to support tabular data. This is essential for enterprise AI users, whose data is often tabular, to create more robust and secure AI models.
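The value of this unification can be illustrated with a small, self-contained sketch. The code below is purely conceptual and does not use ART's actual API: it shows how a minimal common interface lets the same framework-agnostic code drive very different model types, the idea that underlies ART's wrappers for deep learning and classical models alike.

```python
# Conceptual sketch (not ART's actual API): a unified wrapper interface
# lets the same attack/defence/evaluation code drive different model types.

class ClassifierWrapper:
    """Minimal common interface the surrounding tooling needs from any model."""
    def predict(self, x):
        raise NotImplementedError

class ThresholdModel(ClassifierWrapper):
    """Stand-in for a 'classical' model, e.g. a decision stump."""
    def __init__(self, threshold):
        self.threshold = threshold
    def predict(self, x):
        return 1 if x[0] > self.threshold else 0

class LinearModel(ClassifierWrapper):
    """Stand-in for a linear model such as logistic regression."""
    def __init__(self, weights, bias):
        self.weights, self.bias = weights, bias
    def predict(self, x):
        score = sum(w * xi for w, xi in zip(self.weights, x)) + self.bias
        return 1 if score > 0 else 0

def evaluate(model, samples, labels):
    """Framework-agnostic evaluation: works for any wrapped model."""
    correct = sum(model.predict(x) == y for x, y in zip(samples, labels))
    return correct / len(samples)

samples = [[0.2, 1.0], [0.8, -0.5], [0.6, 0.1]]
labels = [0, 1, 1]
acc_stump = evaluate(ThresholdModel(0.5), samples, labels)
acc_linear = evaluate(LinearModel([1.0, 0.0], -0.5), samples, labels)
```

In ART itself, this role is played by framework-specific classifier wrappers that additionally expose gradients and loss functions where the underlying model provides them.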
Reports of real-world exploits using adversarial attacks against AI are growing, as in the case of anti-virus software, highlighting the importance of understanding, improving, and monitoring the adversarial robustness of AI models. ART provides a comprehensive and growing set of tools to systematically assess and improve the robustness of AI models against adversarial attacks, including evasion and poisoning.
In evasion attacks, the adversary crafts small changes to the original input of an AI model in order to influence its behaviour. This can happen, for example, by making imperceptible changes to the pixels of an image of a cat, which may cause the image to be misclassified as an ambulance (see this interactive demonstration). In poisoning attacks, adversaries tamper with a model’s training data before the model is trained, in order to introduce a backdoor that can later be exploited via designated triggers (see an interactive demonstration here).
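To make the evasion idea concrete, here is a minimal, self-contained sketch in the style of the fast gradient sign method (FGSM), applied to a hand-written logistic regression. The model weights are assumptions chosen for illustration; this is not ART's implementation, which offers this and many other attacks as ready-made classes.

```python
import math

# Illustrative FGSM-style evasion on a tiny hand-written logistic regression.
# Weights are chosen for illustration only.
w = [2.0, -1.5]
b = 0.1

def predict_proba(x):
    """P(class 1 | x) for the logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(x, y, eps):
    """One FGSM step: shift each feature by eps in the direction that
    increases the cross-entropy loss for true label y."""
    p = predict_proba(x)
    # For logistic regression, d(loss)/d(x_i) = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * (1 if g > 0 else -1) for xi, g in zip(x, grad)]

x = [1.0, 0.5]               # original input, confidently classed as 1
p_clean = predict_proba(x)   # well above 0.5
x_adv = fgsm(x, y=1, eps=0.8)
p_adv = predict_proba(x_adv) # drops below 0.5: the attack succeeds
```

The same pattern, perturbing the input along the sign of the loss gradient, scales up to deep networks, where it produces the imperceptible image perturbations described above.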
An example of an evasion attack against a non-linear support vector machine (SVM) classifier is illustrated in Figure 1. It was obtained using the v1.0 ART extension for scikit-learn models; for more background and other examples, we refer the reader to the ART sample notebooks. The original inputs of the SVM belong to three different classes: 0 (orange), 1 (blue), 2 (green). The background colors in the three plots show the probabilities assigned to each of the three classes by the SVM. Note that the SVM assigns high probability to the correct classes of the original inputs. The adversarial samples, shown as red points, are obtained by displacing the original inputs. The adversarial samples are highly effective in crossing decision boundaries, while keeping the displacement as small as possible.
Figure 1: Adversarial samples against a non-linear support vector machine on a multi-class classification problem.
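The "smallest possible displacement" property seen in Figure 1 has a simple closed form in the linear case, which the following hedged sketch illustrates (a linear stand-in for the non-linear SVM in the figure, not the actual code that produced it): for a boundary f(x) = w·x + b = 0, the nearest crossing point is reached by moving along w with step length f(x)/‖w‖², as in DeepFool-style attacks.

```python
import math

# Minimal-displacement evasion against a linear decision boundary
# f(x) = w.x + b = 0; w and b are assumed for illustration.
w = [1.0, 2.0]
b = -2.0

def f(x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def minimal_evasion(x, overshoot=1.01):
    """Smallest L2 displacement pushing x just across the boundary:
    step along w by overshoot * f(x) / ||w||^2."""
    scale = overshoot * f(x) / sum(wi * wi for wi in w)
    return [xi - scale * wi for xi, wi in zip(x, w)]

x = [3.0, 2.0]              # classified positive: f(x) > 0
x_adv = minimal_evasion(x)  # now just on the other side: f(x_adv) < 0
displacement = math.dist(x, x_adv)
```

For a non-linear model such as the SVM in Figure 1, the same objective is pursued iteratively, re-linearising the decision boundary at each step, which is why the red adversarial points sit just across the boundaries from their originals.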
The main differences between ART and similar projects are its focus on defence methods and its independence from any single machine learning framework, which prevents users from being locked into one ecosystem.
As an open-source project, the ambition for ART is to create a vibrant ecosystem of contributors from both industry and academia. Since its initial release, ART has grown increasingly popular among scholars and practitioners in machine learning and AI security, with more than 940 stars and more than 240 forks currently on GitHub, and a large and increasing number of contributors from inside and outside IBM.
ART has been consistently extended and improved by a global team of IBM researchers located in Ireland, the United States, and Kenya. In addition to ART, IBM Research recently released two other open-source toolkits to help build trustworthy AI: the AI Fairness 360 Toolkit and the AI Explainability 360 Toolkit.
We hope the ART project will continue to stimulate research and development around adversarial robustness of machine learning models, and thereby advance the deployment of secure AI applications.
White paper: M.-I. Nicolae, M. Sinn, et al.: Adversarial Robustness Toolbox v1.0.0. https://arxiv.org/abs/1807.01069