Abstract:
Nowadays, machine learning models are used in many real-world AI-based systems. However, those models are at risk from cyber attacks commonly known as adversarial attacks. This threat calls into question the usefulness and validity of predictions made by machine learning models whose inputs can be manipulated by attackers. The Adversarial Robustness Toolbox (ART) is an open-source machine learning security library developed at IBM Research and written in Python. ART implements many state-of-the-art adversarial attacks and defense mechanisms for both conventional machine learning and deep learning models. As a development tool, the library can be used to train and debug machine learning models against different adversarial attacks (i.e., evasion, poisoning, extraction, and inference attacks), and it additionally provides techniques to defend models and to measure model robustness.
In this paper we focus only on the data poisoning and evasion attacks supported in the current version of the Adversarial Robustness Toolbox (v1.5.x), evaluating the performance of those attacks against classical machine learning methods for classification tasks in an adversarial environment. Specifically, we evaluate the attack methods ART supports against four supervised learning methods (Support Vector Machines, Decision Trees, Random Forests, and Gradient Boosted Decision Trees), two machine learning frameworks (scikit-learn and LightGBM), and two publicly available datasets (the Census Income dataset and the MNIST handwritten digit database, for tabular and image data respectively), covering binary and multi-class classification problems respectively.
Keywords: Adversarial Robustness Toolbox, ART Attacks, Adversarial Examples
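To make the kind of setup described above concrete, the following is a minimal, illustrative sketch, not the paper's actual experimental code, of wrapping a scikit-learn SVM with ART and running one of its black-box evasion attacks (HopSkipJump). The dataset (scikit-learn's 8x8 digits as a small stand-in for MNIST), the attack choice, and all hyper-parameters here are assumptions made purely for illustration.

```python
# Illustrative sketch only: wrap a scikit-learn SVM with ART and craft
# evasion examples using the black-box HopSkipJump attack.
# Dataset, attack choice, and hyper-parameters are assumptions, not the
# paper's experimental configuration.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import HopSkipJump

# Small stand-in for MNIST: scikit-learn's 8x8 digits, scaled to [0, 1]
X, y = load_digits(return_X_y=True)
X = X / 16.0
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Train a plain scikit-learn model, then wrap it as an ART classifier
model = SVC(kernel="rbf", probability=True).fit(X_train, y_train)
classifier = SklearnClassifier(model=model, clip_values=(0.0, 1.0))

# Black-box evasion attack: perturb test samples to change the predicted label
attack = HopSkipJump(classifier, targeted=False, max_iter=10, max_eval=1000)
X_adv = attack.generate(x=X_test[:20])

# Compare accuracy on clean vs. adversarial samples
clean_acc = np.mean(model.predict(X_test[:20]) == y_test[:20])
adv_acc = np.mean(model.predict(X_adv) == y_test[:20])
print(f"accuracy on clean samples: {clean_acc:.2f}, on adversarial samples: {adv_acc:.2f}")
```

The same pattern, training a model in its native framework and then wrapping it with the corresponding ART estimator before instantiating an attack, applies to the other model types and attacks evaluated in this paper.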