
  • The majority of research focuses on image datasets and the Convolutional Neural Network (CNN) architecture: As shown in Table III and Fig. 7 , among the works that provide details of each component of the taxonomies, nearly 63% focus on image datasets and the CNN architecture, and 75% focus on image datasets with some other ML architecture. Less than 2% of the works focus on datasets other than image or text. However, a large set of deployed mission-critical applications in healthcare [ 28 ], industrial systems [ 29 ], or the stock market domain [ 30 ] use time series data. This taxonomy-based categorization indicates that such ML applications are largely ignored and hence potentially vulnerable in adversarial settings.

    Fig. 7: Clustering the state-of-the-art research on adversarial learning based on their ML algorithm and dataset.

  • Certified defenses that are expected to end the arms race may not be applicable to a large set of ML applications that do not use human-perceivable datasets: A new domain of research on certifiable defenses requires attackers to apply changes beyond a distance threshold to the input data in order to have it classified as a legitimate data sample, so that the altered inputs can be identified and removed by a human observer [ 31 ]–[ 34 ]. This can effectively curtail the arms race under the assumption that a human observer can perceive changes beyond the threshold. However, such certified defenses have only been proposed for text, voice, image, or video data, which are perceivable by humans. They are not yet applicable to the plethora of ML applications that use data such as physiological signals, electrical signals, or stock price time series, where changes may not be readily perceivable by observers. Moreover, even if the observer is capable of discerning altered inputs, these defenses assume that the observer is always attentive and focused. In many recent ML application failures, such as in autonomous cars [ 35 ] or aviation [ 36 ], observers were either not focused or overwhelmed with other critical functionalities of the systems.
  • Generation numbers can be used to prevent adversarial analysis of sub-optimized ML applications when more evolved system models of the same type exist: The ordering of research works implied by generation numbers can be matched against their chronological order to evaluate research works in this domain. For example, a lower-generation work that is more recent than an earlier higher-generation work is likely to have less research impact unless it fills a significant gap in generation numbers. This configurable metric enables structured review of novel research works in this domain, following the best practices proposed in some recent research [ 37 ].
  • Taxonomy-based categorization can provide useful directives to ML designers for developing robust ML applications: Given an input data type and an ML architecture, an ML application designer can utilize the taxonomy to potentially develop a more robust system. The designer can choose the most practical attack model that can be envisioned for the system from the attack system taxonomy and then select the most effective defense system from Table III .

II. Related Research

Research on analyzing ML applications in adversarial settings can be studied in three main groups: 1) literature reviews , 2) analytical taxonomies , and 3) best practices . The first group provides a valuable body of knowledge about state-of-the-art security threats to ML applications and possible solutions, but does not go further than reviewing and discussing the current research [ 38 ]–[ 47 ]. The second type of research provides taxonomies for in-depth analysis and evaluation of the security issues in ML applications [ 20 ], [ 48 ]. The early research by Barreno et al. [ 49 ] provides a taxonomy of attacks and defenses in adversarial ML. In their research, the effects of attacks and defenses on the integrity and availability of the target system are studied. However, that work leaves a more detailed analysis of the arms race between attacks and defenses as future research. Some recent research on adversarial ML covers the arms race [ 11 ], [ 12 ], [ 41 ], [ 48 ], [ 50 ]–[ 62 ]. These works can be studied under three main subcategories: a) data-driven [ 41 ], [ 63 ]–[ 65 ], b) algorithm-driven [ 60 ], [ 62 ], [ 66 ], and c) application-driven [ 52 ], [ 53 ], [ 58 ], [ 61 ], [ 67 ], [ 68 ] taxonomies. As an example of data-driven research, Liu et al. [ 41 ] categorize attacks and defenses based on the data type (i.e. training or testing). In the algorithm-driven approach, Ling et al. [ 62 ] develop a taxonomy by providing a platform that implements several state-of-the-art attacks and defenses. The platform enables testing and evaluating deep learning applications under attack, along with the effectiveness of available defenses. As an example of application-driven taxonomies, Laskov et al. [ 53 ] propose a taxonomy that focuses on the effect of the adversary’s knowledge on the attack success rate and possible defense solutions in an on-line PDF malware detector (i.e. PDFRATE [ 69 ]). Although these works provide in-depth analysis of the arms race, they are highly dependent on specific data types, algorithms, or applications. As a result, they may not adapt efficiently to cover a wide range of applications or potential future threats. Our research proposes a system-driven taxonomy that surveys the state-of-the-art research based on the system models of the adversary, the ML application, and the defense mechanism to derive a comprehensive model of their interactions. The developed model is independent of a specific data type, algorithm, or application; it not only covers existing works, but also reveals potential vulnerabilities that have not yet been addressed by the current research. It can also be used to identify specific defense strategies that can be added to an existing ML application to manage its vulnerabilities to attacks. Finally, the third group contains best practices, which provide intuitive suggestions on potential research gaps and high-level guidelines for filling them [ 37 ]. Our work provides a framework to evaluate recent research in terms of the guidelines proposed in visionary research [ 37 ] and similar works. As an example, the works by Biggio et al. (2015) [ 70 ] and Zhang et al. (2016) [ 71 ] implement the same attack exploiting the same adversary model (Biggio et al. (2013) [ 17 ]), but achieve different success rates. However, a close look at Biggio et al. (2015) [ 70 ] suggests that a defense already exists. Since Zhang et al. (2016) [ 71 ] is a later work, to maintain the arms race it is expected to have considered the defense proposed in Biggio et al. (2015) [ 70 ] as part of the ML application. However, our taxonomy shows otherwise. Although Zhang et al. (2016) [ 71 ] adds great value by improving the success rate of the attack, it disrupts the arms race, since no other works have continued from where it left off.

III. Systematic Analysis of Target Model

Fig. 3 shows a generic process model of ML-based applications utilizing supervised learning (in the rest of the paper, ML refers to supervised learning for classification). An ML application operates in two phases: 1) training and 2) testing . In the training phase, labeled raw data (i.e. training raw data ) is used to set the machine parameters so that a significant fraction of the raw data samples are matched to their corresponding labels. In the testing phase, labels are assigned to unlabeled raw data by passing it through the trained machine. The input data can have different types, such as text [ 72 ], signal [ 73 ], 2D image [ 74 ], 3D image [ 75 ], or video [ 76 ]. Fig. 7 provides a taxonomy of the state-of-the-art research based on their dataset and ML algorithm. The figure shows that most of the works focus on images and CNNs, while signal, text, video, and classic ML draw less attention. Table III summarizes the specifications of the state-of-the-art target systems under perturbation attack. The accuracy (Acc.) is defined as the number of correctly classified test data samples out of the total number of samples. Any data other than the training and testing raw data (a.k.a. original data ), but of the same type, is called surrogate raw data ; for example, traffic sign images of another country for an ML application that has been trained on US traffic signs. Surrogate data may have the same distribution as the original data.

Fig. 3:

A generic system model of ML applications.

Feature Extraction:

After receiving the input raw data, some preprocessing algorithms, such as normalization or filtering, are applied to the data for noise reduction. In the next step, a feature extractor derives a feature vector from the input raw data by mapping the samples in the raw data domain to samples in the feature domain. The feature extractor is often intended to reduce the data dimensionality, which assists in classifying the data [ 87 ]. It is noteworthy that some recent ML algorithms, such as CNNs, do not need a separate feature extractor component; features are extracted as a part of the training process, which still follows the model in Fig. 3 . Derived features from training, testing, and surrogate raw data are respectively called training , testing , and surrogate features .
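As a concrete illustration of this step, the sketch below uses PCA as a stand-in feature extractor that maps raw samples to a lower-dimensional feature domain and (approximately) back; the data shapes, the choice of PCA, and the number of components are illustrative assumptions, not taken from any surveyed work.

```python
# A minimal sketch of the feature-extraction step in Fig. 3, assuming PCA as the
# (approximately invertible) feature extractor; data shapes are placeholders.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
raw_train = rng.normal(size=(1000, 784))   # e.g. flattened 28x28 images
raw_test = rng.normal(size=(200, 784))

extractor = PCA(n_components=50)           # map raw domain -> 50-D feature domain
train_features = extractor.fit_transform(raw_train)
test_features = extractor.transform(raw_test)

# Approximate inverse mapping back to the raw-data domain (relevant to attacks
# crafted in the feature domain, Sec. IV-A2).
reconstructed = extractor.inverse_transform(test_features)
print(train_features.shape, test_features.shape, reconstructed.shape)
```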

Classification:

Fig. 4 categorizes ML algorithms into two main groups: 1) classic ML , which consists of traditional classification algorithms with typically low computational cost, and 2) deep learning , which is computationally intensive. The ML algorithm receives data features as input and computes cost or loss values , which indicate the likelihood of the input belonging to each class [ 88 ]. Typically, a higher cost value or a lower loss value is interpreted as a higher likelihood of belonging to the class. The parameters of the ML algorithm are set so as to minimize the loss for features in their corresponding class. After obtaining the loss vector for a given input feature, the label of the class with the lowest loss is assigned to the input. The most likely class may be the correct one, or the input may be misclassified, affecting the machine accuracy. Typically, ML algorithms cannot provide a perfect classification and may never reach 100% accuracy. A classifier can be trained off-line or on-line ; in off-line mode, the classifier is only trained once, before the system is used, while in on-line training, in addition to the initial training, the trained machine parameters can be updated during the usage of the system with newly collected data. Although on-line training (a.k.a. active learning ) can enhance the ML accuracy, it can also increase the application's vulnerability to attacks, since there is a greater chance of the training data being infected [ 89 ]. Perturbation attacks, as a subset of presentation attacks, are only launched at test time; hence, in this paper we focus only on off-line ML algorithms. There are various approaches to classification: one-class, binary, and multi-class. In all these approaches, the data domain is divided into two parts: 1) source , which contains malicious data, and 2) target , which includes the classes of legitimate data. An adversary manipulates the source data to be recognized as a target data sample by the classifier.
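The following minimal sketch illustrates the decision rule described above, assuming the classifier exposes a per-class loss vector for each input; all values are synthetic.

```python
# A minimal sketch of assigning the class with the lowest loss and computing
# the accuracy (Acc.) as defined in Sec. III; values below are synthetic.
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_classes = 5, 3
loss = rng.random((n_samples, n_classes))      # loss[i, c]: loss of sample i for class c
true_labels = rng.integers(0, n_classes, size=n_samples)

predicted = loss.argmin(axis=1)                # assign the class with the lowest loss
accuracy = (predicted == true_labels).mean()   # correct classifications / total samples
print(predicted, f"accuracy = {accuracy:.2f}")
```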

IV. Systematic Analysis of Attack Model

The attack model can be developed based on the four main attributes of an adversary [ 17 ]: 1) knowledge , 2) capability , 3) goal , and 4) strategy . The attack system specifications columns in Table III evaluate state-of-the-art perturbation attacks. The adversary's Success Rate (SR) is defined as the percentage of non-target data samples that are misclassified as the target class, out of the total number of attacks.

A. Adversary’s Knowledge

As seen in Fig. 5 , two main types of knowledge may be available to an adversary: 1) data and 2) algorithm . In each type, the adversary's knowledge can be limited or full . For instance, a neural network architecture can be specified by the number of layers, the number of nodes in each layer, and the type of transition functions, while an adversary with limited knowledge only knows some of these entities. Table I evaluates the attacks in various works based on the knowledge available at the beginning of the attack. It is noteworthy that the adversary may infer more knowledge while designing the attack. For example, with knowledge about the classifier architecture and parameters, the adversary can obtain losses in further steps. This is not indicated in the table and is further discussed in Sec. IV-D (adversary's strategy).

TABLE I:

The adversary’s knowledge model for perturbation attacks (—: no knowledge, ○: limited knowledge, and ●: full knowledge).

Reference | Data — Raw (Unlabeled): Surrogate, Testing, Training; Loss/Cost; Class Label | Algorithm — Feature Extractor; Classifier: Arch., Parameters | Overall
Goodfellow et al. (2014) [ 12 ]
Biggio et al. (2015a,c) [ 70 ]
Biggio et al. (2015b,d) [ 70 ]
Graese et al. (2016a:f) [ 77 ]
Papernot et al. (2016) [ 78 ]
Zhang et al. (2016a,c) [ 71 ]
Zhang et al. (2016b,d) [ 71 ]
Bulò et al. (2017a:c) [ 79 ]
Cao et al. (2017a:h) [ 80 ]
Carlini et al. (2017a:c) [ 81 ]
Demontis et al. (2017a,c) [ 82 ]
Demontis et al. (2017b,d) [ 82 ]
Osadchy et al. (2017a:d) [ 83 ]
Papernot et al. (2017a:d) [ 84 ]
Xu et al. (2017a:m) [ 85 ]
Zhang et al. (2017) [ 73 ]
Baluja et al. (2018a:b) [ 86 ]

Lowercase letters distinguish among various settings in the experiments (“a:d” indicates all letters between ‘a’ and ‘d’, including both).

1) Knowledge about the Data:

According to the target system model in Sec. III , an adversary may have knowledge about four types of data in the system: 1) raw data, 2) data feature, 3) loss/cost, or 4) class label (each type can appear in different forms of surrogate, testing, or training).

Raw Data and Its Features:

The raw data domain usually has a high dimension, which makes it challenging for an adversary to find an attack point within an affordable time interval. Access to data features of lower dimensionality facilitates searching in a reduced search domain for more successful attacks [ 90 ].

Loss/Cost Values:

With access to the loss/cost values of the input test data, an adversary can pick a data point and reproduce its corresponding input through reverse engineering [ 91 ]. Also, loss/cost values indicate the likelihood of the attack point being accepted by the target class [ 23 ].

Class Labels:

With access to labels, an adversary can present unlabeled surrogate data and observe the outputs to obtain a labeled surrogate dataset for training a surrogate classifier (or substitute classifier ), which resembles the target classifier [ 92 ]–[ 94 ]. Full knowledge about the surrogate classifier is exploited to generate more successful adversarial samples.
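A minimal sketch of this substitute-classifier strategy is shown below, assuming label-only oracle access; the `target_oracle` function, the data, and the model choices are placeholders rather than the setup of any cited work.

```python
# A minimal sketch of substitute-classifier training under label-only oracle
# access; the target model's internals are unknown to the adversary.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
target = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0).fit(X, y)

def target_oracle(samples):
    """Black-box access: the adversary only observes predicted class labels."""
    return target.predict(samples)

# Unlabeled surrogate data drawn from a similar distribution, labeled via the oracle.
rng = np.random.default_rng(0)
surrogate_X = X + rng.normal(scale=0.1, size=X.shape)
surrogate_y = target_oracle(surrogate_X)

substitute = DecisionTreeClassifier(random_state=0).fit(surrogate_X, surrogate_y)
agreement = (substitute.predict(X) == target_oracle(X)).mean()
print(f"substitute/target agreement: {agreement:.2f}")
```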

2) Knowledge about the Algorithm:

An adversary may also know the algorithms that the target system applies to the data.

Feature Extraction:

Many attacks manipulate the data in the feature domain [ 17 ], [ 73 ], [ 95 ]–[ 97 ]. Here, knowledge about the feature extractor, and subsequently the feature domain, is available. The adversary crafts an attack point in the feature domain and finds its corresponding point in the raw data domain, which is only feasible for invertible feature extractors.

Classification:

An adversary's amount of knowledge about the classifier depends on the ML attributes: 1) architecture , including type (e.g. Neural Networks (NNs)) and structure (e.g. number of layers in NNs), and 2) parameters , such as the weights of edges between nodes in NNs, which are set during training. By knowing the ML type, the adversary can train an accurate substitute classifier to initiate more effective attacks (i.e. spending less time with a higher success rate) [ 84 ]. In Table I , limited knowledge about the architecture, such as solely the ML type, is indicated by an empty circle (○). Knowing the structure in addition to the type leads to a more accurate reconstruction of the classifier. Also, knowing the ML parameters means full access to the classifier, which also enables testing a large number of samples on the replica (e.g. by brute force) until 100% successful adversarial samples are found.

In addition to the provided taxonomy, there are some high-level qualitative measures to describe the amount of knowledge available to an adversary: black-box (i.e. no knowledge), gray-box (i.e. limited knowledge), and white-box (i.e. full knowledge) [ 55 ]. The white-box scenario is unlikely in practice, but, as an example, compressing trained ML models from data centers and deploying them on smartphones may allow white-box access, where the initial models can be derived through reverse engineering [ 98 ].

B. Adversary’s Capability

In practical scenarios, an adversary can only alter a portion of the data (surrogate, testing, or training) or has limited access to the target system. The portion size or the number of possible accesses plays a key role in the adversary's success rate [ 99 ]. For example, in attacks on the traffic sign recognition application in autonomous vehicles, in the presence of an external observer (e.g. a passenger), the adversary is not free to make arbitrary changes to signs, since abnormal changes may be detected by the observer [ 100 ]. As another example, much adversarial ML research assumes full knowledge of class labels for the adversary ( Table I ), obtained either by stealing labeled data or through oracle access to the ML application. Although assuming full knowledge for an adversary tests the system security under the worst-case scenario (from the target system's viewpoint), this assumption does not always hold [ 101 ]. For instance, in biometric authentication systems, the application can terminate oracle access if it detects a few illegitimate inputs. Finally, since the focus of the paper is on perturbation attacks, it is assumed that the adversary cannot directly inject altered data features, losses/costs, or class labels, or change the feature extraction or classification algorithms (i.e. concerns mostly related to network or system security). In this case, the adversary can only alter the input raw data.

1) Altering the Training Raw Data:

One of the most common ways of attacking the ML algorithm is known as a poisoning attack , where an adversary alters the training raw data (either the raw data itself or its labels) to change the classifier's parameters in its favor. In general, any attempt at illegitimate changes to the classifier settings is called a causative attack , which is a subset of active attacks ( Fig. 2 ). Depending on the amount of possible alteration at training time (i.e. limited [ 52 ] or full [ 102 ]), the classifier boundaries can be moved toward the source data class, which leads to misclassification of source data as the target class [ 54 ].

2) Altering the Testing Raw Data:

An adversary can alter the available surrogate or testing source raw data to craft adversarial samples that are classified into the target class at test time. Perturbing the data may trick the system, but too much perturbation can raise suspicion [ 116 ].

Altering Data Features (Surrogate or Testing):

Due to the low dimensionality of the feature domain, finding attack points in this domain is more time efficient. Raw adversarial samples can then be retrieved by applying an inverse feature extraction to the attack points in the feature domain [ 174 ].

Altering Loss/Cost Values or Class Labels:

To the best of our knowledge, there is no attack that applies reverse engineering on these data types.

C. Adversary’s Goal

The adversary's goals can be categorized into three main groups based on the violation of three high-level security aspects [ 55 ]: 1) confidentiality , 2) integrity , or 3) availability , also known as the CIA triad. These goals can be either: 1) indiscriminate , where any class's data or any component of the ML application is prone to attack, or 2) targeted , which attacks a specific class or component. In an indiscriminate perturbation attack, the source data is misclassified into any target class, while in the targeted version, it should be mis-recognized as data of a specific target class [ 54 ].

Confidentiality:

Confidentiality violation happens when information about the application's users or components is disclosed. Finding out whether a dataset belongs to the ML training dataset [ 93 ], extracting training data [ 175 ], stealing the ML model [ 176 ], or predicting users' critical information [ 177 ] are some examples of violating confidentiality. These types of attacks can be indiscriminate, when information from random users or components of the application is stolen, or targeted, when the victims are specific users or components.

Integrity:

Penetrating an ML application using illegitimate data leads to unauthorized access to information and subsequently an integrity violation. Examples include the acceptance of malware samples in malware detection systems [ 82 ], the impersonation of another user in facial biometric systems [ 6 ], or the detection of an unknown object (e.g. a fire hydrant) as a traffic sign in autonomous vehicles [ 100 ]. An integrity violation can be indiscriminate, when a malicious sample can be recognized as any legitimate data. In contrast, in targeted attacks, a malicious sample should be recognized as a specific legitimate sample.

Availability:

Availability is disrupted when the application does not accept legitimate data samples, for instance, recognizing legitimate data as malware in malware detectors [ 82 ], rejecting login attempts from legitimate users in biometric systems [ 178 ], or ignoring traffic signs in autonomous vehicles [ 100 ]. In an indiscriminate availability violation, any legitimate sample may be recognized as malicious data, while in a targeted attack, specific legitimate samples are rejected.

D. Adversary’s Strategy

Attack strategies can be studied in two main groups ( Fig. 2 ):

1) Passive Attacks:

Their aim is to derive information about the application or its users [ 179 ]. Two common approaches for passive or exploratory attacks are: 1) probing [ 93 ] and 2) Reverse Engineering (RE) [ 180 ].

Probing:

In this attack, an adversary submits queries to the system and observes the responses. For instance, the adversary may provide input data to the system and observe the outputs (e.g. class labels), also known as oracle access , to gain information about the training dataset [ 49 ]. Probing can determine whether a raw data sample belongs to the training dataset [ 93 ]. In addition, finding the relationship between changes in the inputs and the outputs enables testing randomly guessed input data until an input in the target class is reached. Finally, the adversary can prepare synthetic training data by performing several tests on the oracle and plan more advanced attacks [ 54 ]. Synthetic data is a type of surrogate data that is close to the original data distribution.

Reverse Engineering:

Based on some information about the distribution of the data (possibly through eavesdropping or probing), an adversary may guess the class boundaries or develop a substitute model of the classifier [ 180 ]. As an example, the functionality of a black-box classifier can be stolen by labeling some surrogate data through oracle access and training a deep learning based substitute model [ 181 ].

2) Active Attacks:

These attacks disrupt the application’s normal operation. Active attacks can be launched either through: 1) presentation or 2) causative attacks ( Fig. 2 ).

Presentation Attack Strategies:

In this approach, an adversary is only capable of altering the input raw data (and cannot change any data or algorithm inside the application). Presentation attacks are the most popular type of active attacks, where the adversary presents falsified raw data at test time to be misclassified as target class data. These attacks can be categorized into three main groups based on the origin of the data exploited for the attack: 1) zero-effort , 2) replay , and 3) perturbation attacks. In the simplest case, the Zero-Effort (ZE) attack, the adversary chooses an attack sample among the source data without making any changes to it [ 182 ]–[ 184 ]. Due to imperfect classification by the ML algorithm, the sample might be misclassified as the target class. Unlike the ZE attack, in a Replay Attack (RA), the adversarial sample is picked from stolen [ 185 ] or artificially crafted (i.e. artefact [ 186 ]) target class data to mimic the target data features [ 73 ]. There are various strategies for launching a replay attack. Stealing the data (e.g. an image of a traffic sign) or its origin (i.e. the real traffic sign) are the most common ones. In practice, small changes (e.g. white noise) are applied to the stolen data to pass some defenses, such as a similarity check ( Sec. V-B ) [ 187 ]. Reverse engineering on features [ 73 ] or creating different representations of the same target data using generative models [ 188 ] are common methods for crafting artefacts. In the presence of an external observer (e.g. passengers in autonomous vehicles), there is a high chance of detecting zero-effort and replay attacks.

As a more advanced strategy, in a perturbation attack [ 54 ], the adversary obtains an adversarial sample by altering a source data sample (e.g. by adding carefully crafted perturbations, a.k.a. adversarial noise ) so that it is accepted as target class data while remaining hidden from the observer [ 189 ], [ 190 ]; for example, spraying some paint on a stop sign so that it is recognized as a speed limit sign by the object recognition system in autonomous vehicles [ 100 ]. For obtaining adversarial samples, the adversary faces a search problem, which may be solved through two main approaches ( Fig. 6 ): 1) non-guided search , where the adversary has no clue about the classifier function, and 2) guided search , where the adversary knows the function. Grid search [ 191 ] and random search [ 192 ] are examples of non-guided search techniques. The second approach, which has drawn more attention, suggests two strategies: 1) optimization [ 11 ], which, due to the non-linearity of the classifier function in most cases, narrows down to non-linear programming, or 2) game theory , where a game is defined between the ML application and the adversary to find effective adversarial samples [ 26 ], [ 193 ], [ 194 ]. In game theoretical strategies, on one hand, the adversary attempts to generate an adversarial sample to be classified as target class data, and on the other hand, the ML application attempts to recognize the sample and reject it [ 42 ], [ 195 ], [ 196 ]. Generative Adversarial Nets (GAN) [ 142 ], [ 143 ] are a popular example of game theoretical strategies for generating adversarial samples. In particular, applying deep learning to develop a GAN can lead to crafting samples with a high chance of success [ 144 ]. A GAN is composed of a generator and a discriminator [ 142 ]. The generator produces adversarial samples, which the discriminator attempts to recognize.
Indeed, there is a co-evolution between these two modules, which optimizes the functionality of both of them. The game between the generator and the discriminator can roughly be interpreted as a set of coupled optimization problems (i.e. a minimax game [ 142 ]), where both opponents (i.e. the adversary and the ML application) do their best to maximize the success rate and the classification accuracy, respectively [ 197 ]. Game theoretic methods can also be used for defense, by reinforcing the classifier in the ML application to block the adversarial samples [ 198 ]. In general, game-theoretic methods use losses in both the attack and target systems to design attack and defense strategies.

With knowledge about the ML algorithm, the adversary can derive the loss function and exploit it as the objective function of the problem. The objective is to find an adversarial sample that minimizes the loss. There are various methods to obtain the objective function, such as deriving it from a substitute classifier ( Sec. IV-A2 ). An input crafted to break the substitute classifier can be effective on the original one, due to the transferability properties of ML algorithms [ 92 ]. The objective functions derived from commonly used ML techniques are non-linear and non-convex, which means the optimal solution is likely to be a local optimum; however, it can still be used for attack [ 78 ]. Based on the properties of the derived loss function, the optimization-based techniques are categorized into three main groups ( Fig. 6 ): 1) non-differentiable , when derivatives of the loss function are not defined on the search domain, 2) differentiable , when derivatives of the function can be calculated, and 3) twice differentiable , when the second order derivatives of the function on the search domain are available. Table II lists state-of-the-art perturbation attack strategies. The adversarial sample manufacturing problem can also be defined as a game between the attack and target systems, where, on one hand, the adversary attempts to generate adversarial samples to be classified as the target class, and, on the other hand, the ML application attempts to recognize and reject the samples [ 42 ]. For example, in Generative Adversarial Networks (GANs) , the game between the generator and the discriminator can be interpreted as a set of coupled optimization problems, where both opponents (i.e. the adversary and ML application) attempt to maximize the success rate and classification accuracy, respectively [ 142 ], [ 197 ], [ 259 ].
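As an illustration of a gradient-based guided search, the sketch below applies the Fast Gradient Sign (FGS) idea from Table II to a logistic-regression target whose loss function is known to the adversary; the model, data, and perturbation budget are illustrative assumptions, not a reconstruction of any cited experiment.

```python
# A minimal FGS sketch on a logistic-regression target with a known loss function.
import numpy as np

rng = np.random.default_rng(0)
w, b = rng.normal(size=20), 0.1          # known classifier parameters
x, y = rng.normal(size=20), 1            # source sample with true label y = 1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Gradient of the cross-entropy loss with respect to the input x:
# d/dx [-y*log(p) - (1-y)*log(1-p)] = (p - y) * w, with p = sigmoid(w.x + b)
p = sigmoid(w @ x + b)
grad_x = (p - y) * w

eps = 0.1                                 # perturbation budget
x_adv = x + eps * np.sign(grad_x)         # step each input dimension against the true class

print("clean score:      ", sigmoid(w @ x + b))
print("adversarial score:", sigmoid(w @ x_adv + b))
```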

TABLE II:

Summary of state-of-the-art perturbation attack strategies using guided search methods.

Attack Strategies
Type Method Description
Non-Differentiable Evolutionary Algorithms (EA) [ 23 ], [ 103 ] In each generation, the features that reduce the loss values are kept, while the rest are replaced by mutants.
Generative Models [ 104 ] Generating adversarial samples by developing generative models using machine learning algorithms.
Gradient Estimation Attacks [ 105 ], [ 106 ] Estimating the loss function’s gradient using the finite differences method [ 107 ] to calculate the amount of perturbation.
Zeroth Order Optimization (ZOO) [ 108 ] Using zero order optimization to estimate the gradients of DNN toward generating the adversarial samples.
Local Search Optimization [ 109 ] Running the local search algorithm using the loss values to find a perturbed data, which minimizes the loss.
Hill-Climbing (HC) [ 23 ], [ 110 ]–[ 114 ] A source data sample is iteratively modified until reaching a sample, which reduces the loss function.
Differentiable Adam Optimizer [ 81 ], [ 115 ] Minimizing the amount of distortions required to generate adversarial samples using Adam optimization algorithm.
Deepfool [ 116 ] Minimizing the amount of required perturbations to change the source data class using gradient-based optimization.
Elastic-net Attacks to DNNs (EAD) [ 117 ] Minimizing the linear combination of the loss function and the L1 and L2 norms of the amount of perturbation.
Gradient Descent (GD) [ 118 ], [ 119 ] The loss function’s gradient is derived to find the direction of changes in a source data sample to reach the target class.
Dense Adversary Generation (DAG) [ 120 ] The losses of belonging to both source and target classes form the optimization problem, which is solved by GD method.
Jacobian-based Saliency Map Attack (JSMA) [ 54 ], [ 121 ] Reducing the search space dimensionality using adversarial saliency map [ 122 ] for Jacobian-based objective functions.
JSMA-Z [ 78 ] A version of JSMA, which uses the second to the last layer (Z) outputs in deep neural networks.
Carlini and Wagner’s (C&W) Attack [ 81 ], [ 123 ] JSMA using the softmax layer output, as a modification on JSMA-Z, to be resilient to defensively distilled networks [ 78 ].
RP2 Algorithm [ 124 ], [ 125 ] Adding physical constraints (e.g. sample printability) to JSMA to find robust physical world adversarial samples.
Fast Gradient Sign (FGS) [ 12 ], [ 54 ], [ 76 ], [ 126 ], [ 127 ] Perturbing the source data toward reaching a neighbor class by applying the sign function on gradients.
Least Likely Class (LLC) [ 128 ] A version of FGS method for targeted attacks, where the adversarial samples fall in a specific target class.
Iterative Fast Gradient Sign (IFGS) [ 74 ], [ 129 ] Limiting the amount of alteration in each point of the source data samples, which leads to multiple runs of FGS.
Projected Gradient Descent (PGD) [ 129 ] Another name for IFGS, which is used in adversarial ML literatures.
Iterative Least Likely Class (ILLC) [ 74 ] A version of IFGS method for targeted attacks, where the adversarial samples fall in a specific target class.
Random Perturbation with FGS [ 130 ] Adding a small random perturbation to the initial point before applying FGS to avoid its non-smooth vicinity.
Random Perturbation with LLC [ 130 ] Adding a small random perturbation to the initial point before applying LLC to avoid its non-smooth vicinity.
Robust IFGS [ 83 ] Filtering the adversarial samples before inputting them to the objective function in FGS, to be resilient to filtering defenses.
Momentum IFGS [ 131 ] Integrating the momentum method [ 132 ] with IFGS to stabilize changes directions and avoid local optima.
Universal Adversarial Sample [ 133 ] A version of deepfool that generates a perturbation to escape most of the classes of data with the same distribution.
Quasi-Newton’s Methods (e.g. BFGS) [ 134 ], [ 135 ] Approximating the second order derivatives in the process of generating the adversarial samples.
Limited-memory BFGS (L-BFGS) [ 11 ] A version of BFGS algorithm, which has been designed to use limited amount of computer memory.
L-BFGS-B [ 136 ]–[ 138 ] An extension of L-BFGS, which has been designed for handling box constraints on the perturbations.
Spatial Transformation [ 139 ] Finding the minimum displacement of the source data in generating adversarial samples using gradient-based optimization.
Wasserstein Adversarial Samples [ 140 ] Minimizing the amount of distortions required to generate adversarial samples using Wasserstein distance metric.
Twice Differentiable Newton’s Method [ 141 ] Obtaining the Newton’s direction using the first and second order derivatives to calculate the perturbations.
Game Theory Generative Adversarial Network (GAN) [ 142 ]–[ 145 ] Generating adversarial samples using the generator component in GAN.
Adversarial Transformation Network (ATN) [ 91 ] A version of GAN, which trains a generator receiving source data to generate minimally perturbed adversarial samples.
Universal Adversarial Network (UAN) [ 104 ] Generating adversarial noise using generative models developed by reverse engineering of the target classifier.
Two-Person Game [ 146 ], [ 147 ] Developing a two-person constant sum game, where adversary perturbs the input and recognizer should detect distortions.

Causative Attack Strategies:

This approach targets the classifier itself, including its parameters or architecture [ 260 ]. Poisoning [ 261 ] is the most common causative attack strategy, and is categorized as: 1) label contamination [ 262 ] and 2) red herring [ 102 ]. In label contamination, the training data labels are altered to change the classifier's boundaries [ 263 ]. Label contamination can be applied to training data with either hard class labels (e.g. the digits 0–9 as labels for a handwritten digits dataset [ 148 ]) or soft class labels (e.g. losses derived from the ML algorithm used as labels to train another machine [ 98 ]). Label flipping is a special case of label contamination that deals with hard class labels. For instance, in the absence of an external observer, an adversary can flip the labels of the source class training data to expand the target class boundaries toward the source data. Selecting the labels most influential in altering the class boundaries can be formalized as an optimization problem [ 262 ]. In this strategy, after training, a source data sample can be accepted as target class data without perturbing it [ 54 ]. In the presence of an external observer, flipping the labels raises suspicion, so the adversary should instead poison the data itself, also known as red herring. The adversary perturbs the unlabeled training data to shift the target class boundaries toward the source data. At the same time, the amount of change in the data should be low enough to remain hidden from the observer. This problem can be formalized as an optimization problem, where knowledge about the target class data or the ML algorithm can elevate the attack success rate [ 264 ].
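The sketch below illustrates label flipping on a linear classifier, assuming the adversary can relabel a fraction of the source-class training points; the dataset, classifier, and flip rate are illustrative choices, not a reconstruction of the cited attacks.

```python
# A minimal sketch of label flipping (a special case of label contamination).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

SOURCE, TARGET = 0, 1                     # malicious (source) vs. legitimate (target) class
clean_clf = LogisticRegression().fit(X_tr, y_tr)

# Flip the labels of 20% of the source-class training points to the target class,
# expanding the target-class boundary toward the source data.
rng = np.random.default_rng(0)
y_poisoned = y_tr.copy()
source_idx = np.flatnonzero(y_tr == SOURCE)
flip_idx = rng.choice(source_idx, size=len(source_idx) // 5, replace=False)
y_poisoned[flip_idx] = TARGET

poisoned_clf = LogisticRegression().fit(X_tr, y_poisoned)

source_test = X_te[y_te == SOURCE]
print("source misclassified as target (clean):   ",
      (clean_clf.predict(source_test) == TARGET).mean())
print("source misclassified as target (poisoned):",
      (poisoned_clf.predict(source_test) == TARGET).mean())
```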

V. Systematic Analysis of Defense Model

Defense methods are designed based on two main approaches [ 265 ]: 1) proactive , when a target system is prepared for potential threats before an attack [ 266 ], and 2) reactive , when defense strategies are applied after the occurrence of an attack, such as machine unlearning after poisoning attacks [ 267 ]. Most defenses take the first approach to prevent as much damage as possible [ 39 ]. Fig. 8 categorizes the proactive defense methods against perturbation attacks into two main groups: 1) modifying the classifier and 2) adding a specialized detector. Table IV provides brief descriptions of the state-of-the-art defense strategies in the taxonomy. The specifications of applications using these defenses are summarized in the defense system specifications in Table III . In the table, the third column shows the accuracy of the target ML after applying the defense method. The defense gain indicates the amount of reduction in the adversary's success rate after applying a defense. Various defense strategies can also be combined (a.k.a. ensemble defenses ) for a higher defense gain; however, this approach does not guarantee an improvement [ 268 ].
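For clarity, the bookkeeping behind the defense gain can be written out as follows; the success rates used here are placeholders, assuming SR is measured as a percentage as defined in Sec. IV.

```python
# A minimal sketch of the defense-gain bookkeeping used in Table III.
def defense_gain(sr_without_defense: float, sr_with_defense: float) -> float:
    """Reduction in the adversary's success rate after applying a defense."""
    return sr_without_defense - sr_with_defense

print(defense_gain(sr_without_defense=92.0, sr_with_defense=17.5))  # -> 74.5
```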

TABLE IV:

State-of-the-art proactive defense strategies against perturbation attacks (MC: Modified Classifier and SD: Specialized Detector).

Defense Strategies
Type Method Description
Adversarial Sample Thwarting (MC) Data Transformation [ 77 ], [ 199 ]–[ 204 ] Transforming the input data (e.g. scanning an image, scaling, or averaging) to reduce the adversarial noise.
Ensemble Data Transformation [ 205 ] Combining various data transformation methods (e.g. image cropping and quilting).
Random Data Transformation [ 206 ] Applying random data transformation methods (e.g. scaling an image data to a random size).
Magnet [ 207 ] Reducing the adversarial noise by passing the input data through an auto-encoder before classification.
Pixeldefend [ 208 ] Reverting input data samples to their corresponding training data distribution before classification.
Noise Filtering [ 83 ] Reducing the adversarial noise by applying filters, such as median filter.
Training Process Modification (MC) Adversarial Training [ 129 ], [ 145 ], [ 198 ], [ 209 ]–[ 215 ] Adding an adversary class to the training dataset containing adversarial samples for training.
Constrained Adversarial Training [ 33 ] Generating adversarial samples for adversarial training by adding Gaussian noise to the target data samples.
Adversarial Boosting [ 210 ] Augmenting the training dataset to train boosted classifiers (e.g. boosted DT) in each boosting round.
Batch Adjusted Network Gradients [ 216 ] Scaling the gradients during training to reduce the effect of outliers in the training data on the CNN weights.
Ensemble Adversarial Training [ 130 ], [ 217 ] Using adversarial samples generated for other ML models to train the target model.
Curriculum Adversarial Training [ 218 ] Generating adversarial samples for wide range of adversaries with various attack models for training.
Adversarial Polytope [ 219 ], [ 220 ] Training the ML considering adversarial classes formed by convex outer of norm-bounded perturbations.
Convex Relaxation [ 221 ] Applying convex relaxation in DNN training to find more accurate approximation of minimal loss values.
Defensive Distillation [ 78 ] Reducing the gradient of the loss function using soft class labels (e.g. loss values) for training.
Feature Masking [ 71 ], [ 222 ], [ 223 ] Selecting specific features for classification, which are less sensitive to adversarial noise.
Foveation-Based Classification [ 224 ] Applying the classifier on different regions of an input data (e.g. an image).
Gradient Inhibition [ 225 ] Linear enlargement of DNN’s weights to reduce the sensitivity of the loss values to adversarial noise.
Gradient Masking [ 98 ] Reducing the gradient of the loss function to reduce the ML sensitivity to distortions in input data.
High-Level Guided Denoiser (HGD) [ 226 ] Including the changes in the activation values of DNNs hidden layers in the loss function for training.
Input Gradient Regularization [ 227 ] Applying double-back propagation [ 228 ] in training DNNs to minimize their sensitivity to small perturbations.
Randomized Smoothing [ 32 ]–[ 34 ] Adding random noise to DNN’s weights in each layer to reduce the loss’s sensitivity to distortions.
Reject-On-Negative-Impact (RONI) [ 99 ] Eliminating the samples with the highest impact on class boundaries from the training dataset.
Stability Training [ 229 ] ML training using sets of perturbed training samples to reduce the ML sensitivity to perturbations.
Quantized Activation Function [ 230 ] Quantizing the activation function outputs in DNNs during training and testing to avoid adversarial samples.
Machine Learning Modification (MC) Classifiers Randomization [ 180 ] Randomly choosing a classifier from a set, for each test query, to avoid classifier’s functionality theft.
Enclosing the Target Class [ 231 ] Surrounding the target class area to avoid blind spots, by applying more complex classifiers.
Competitive Overcomplete Output Layer [ 232 ] Combining multiple nodes results in output layer of DNNs for deciding the test data class.
Convex Learning [ 233 ] Applying a convex optimization approach for more accurate training of classifiers.
Deep Contractive Network (DCN) [ 234 ] Training a deep neural network using the constant features of the training data.
Infinity-Norm SVM [ 235 ] Modifying the SVM training to involve a greater number of features in determining the class boundaries.
Low-Sensitive ML [ 236 ]–[ 239 ] Putting constraints on the ML parameters during training, to reduce its sensitivity to adversarial noise.
Multiple Classifier System [ 16 ], [ 70 ], [ 159 ] Aggregating multiple classifiers outputs using ensemble methods (e.g. bagging or random subspace methods).
Non-Linear ML [ 12 ], [ 240 ] Using highly non-linear classifiers, such as Radial Basis Function (RBF) neural networks or RBF SVMs.
Cross-Lipschitz Regularization [ 241 ] Minimizing the sensitivity of NNs output to perturbations, using cross-Lipschitz regularization during training.
Parseval Regularization [ 242 ] Reducing the sensitivity of ML output to adversarial noise, by regularizing the ML parameters.
Stochastic Activation Pruning [ 243 ] Defining a minimax zero-sum game between the ML and adversary to optimize the ML parameters.
Multi-Model-Based Defense [ 244 ] Constructing a family of classifiers from the target classifier to be chosen randomly at the test time.
Randomized Classifier [ 180 ] Random selection among a distribution of trained classifiers for testing each input sample.
Hybrid Classifier [ 245 ] Combining kNN and DNN classifiers, where kNN is applied on the learned parameters in each layer of DNN.
Region-Based Classification [ 80 ] Labeling a test sample based on the intersection volume of its surrounding area with a given class area.
Adversarial Sample Detection by Specialized Detectors (SD) Feature Extraction [ 73 ], [ 246 ]–[ 251 ] Detection by extracting new data features, which are more sensitive to adversarial noise.
Feature Extraction [ 252 ]–[ 254 ] Analyzing the data representations in the middle layers of deep learning algorithms to detect attacks.
Feature Squeezing [ 85 ], [ 255 ] Rejecting samples with different classification results before and after transformation (e.g. spatial smoothing).
Characterizing Adversarial Subspaces [ 250 ] Extracting spatial features (i.e. local intrinsic dimensionality) from samples to detect adversarial samples.
Multiple Detectors [ 256 ] Analyzing various data features, such as soft-max distributions or manifold analysis, by consecutive detectors.
Surrogate Classifier [ 28 ], [ 257 ], [ 258 ] Applying a separate add-on classifier (e.g. polynomial kernel-based SVM) to detect the adversarial samples.

A. Modifying the Classifier

Adversarial samples tend to lie inside the target class boundary, which typically covers a vast area containing blind spots [ 11 ] (parts of the target class where there is no training data). A classifier can prevent access to the blind spots by utilizing adversarial sample thwarting, or eliminate the spots by modifying its training process or ML algorithm.

1) Adversarial Sample Thwarting:

This approach is based on neutralizing the perturbations in adversarial samples before classification. Several methods reduce the possible adversarial noise in the test data, such as data transformation [ 199 ], noise filtering [ 83 ], or mapping to normal samples [ 207 ].
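A minimal sketch of input-transformation thwarting is shown below, using a median filter as the noise-reduction step (one of the options listed above); the image and the classifier call are placeholders.

```python
# A minimal sketch of adversarial-sample thwarting by input transformation,
# with a median filter as the noise-reduction step.
import numpy as np
from scipy.ndimage import median_filter

def thwart_and_classify(image, classifier):
    """Apply noise filtering before classification to weaken pixel-level perturbations."""
    filtered = median_filter(image, size=3)      # 3x3 median filter over the image
    return classifier(filtered)

# Example: a (possibly perturbed) 28x28 grayscale image and a dummy classifier.
perturbed = np.clip(np.random.default_rng(0).normal(0.5, 0.2, (28, 28)), 0, 1)
dummy_classifier = lambda img: int(img.mean() > 0.5)
print(thwart_and_classify(perturbed, dummy_classifier))
```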

2) Training Process Modification:

These defenses revise the type and arrangement of the training data or modify the training process to avoid misclassification of adversarial samples into the target class [ 214 ]; examples include adversarial training [ 129 ] and defensive distillation [ 78 ].
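The sketch below shows one common flavor of adversarial training, in which gradient-sign-perturbed copies of the training samples are added back with their original labels (the variant listed in Table IV instead adds a separate adversary class); the model, data, and perturbation budget are illustrative assumptions.

```python
# A minimal sketch of adversarial training on a logistic-regression model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train the target classifier on clean data (illustrative dataset).
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
clf = LogisticRegression().fit(X, y)

# Craft perturbed copies of the training samples with a gradient-sign (FGS-style)
# step on the logistic loss with respect to the input.
eps = 0.2
p = clf.predict_proba(X)[:, 1]              # probability of class 1
grad_X = (p - y)[:, None] * clf.coef_       # d(loss)/dx for each sample
X_adv = X + eps * np.sign(grad_X)

# Augment the training set; the perturbed copies keep their true labels.
X_aug = np.vstack([X, X_adv])
y_aug = np.concatenate([y, y])
robust_clf = LogisticRegression().fit(X_aug, y_aug)

print("clean accuracy:       ", robust_clf.score(X, y))
print("accuracy on perturbed:", robust_clf.score(X_adv, y))
```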

3) ML Algorithm Modification:

Another defensive approach is to modify the classifier by refining its ML algorithm to draw more accurate boundaries around each class's training data and thus reject falsified test data [ 235 ]. Applying non-linear ML algorithms [ 240 ] and designing robust ML algorithms [ 233 ] are common modification approaches.

B. Adding a Specialized Detector

In the other type of defense strategy, a specialized detector is added as a separate component to the target system. With an add-on attack detector, adversarial samples may be blocked (depending on the detector accuracy) before classification [ 246 ], [ 252 ], [ 257 ]. Many detectors extract new data features that are more sensitive to adversarial noise for accurate detection [ 249 ]–[ 251 ]. Feature squeezing [ 85 ] and applying consecutive detectors [ 256 ] are examples in this category.
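A minimal detector sketch in the spirit of feature squeezing follows: a sample is flagged when the classifier's outputs before and after a squeezing transform disagree by more than a threshold; the squeezer, the dummy probability model, and the threshold are placeholders, not the exact configuration of [ 85 ].

```python
# A minimal sketch of a specialized detector based on feature squeezing.
import numpy as np

def squeeze(x, bits=4):
    """Reduce the bit depth of inputs in [0, 1] (one simple squeezing transform)."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def is_adversarial(x, predict_proba, threshold=0.5):
    """Flag the sample if the L1 gap between original and squeezed predictions is large."""
    p_original = predict_proba(x)
    p_squeezed = predict_proba(squeeze(x))
    return np.abs(p_original - p_squeezed).sum() > threshold

# Dummy 3-class probability model, for demonstration only.
rng = np.random.default_rng(0)
dummy_proba = lambda x: np.array([x.mean(), 1 - x.mean(), 0.0]).clip(0, 1)
sample = rng.random(28 * 28)
print(is_adversarial(sample, dummy_proba))
```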

VI. System-Driven Taxonomy of the Arms Race

In this section, a unified system-driven taxonomy is developed to compare state-of-the-art research works in adversarial ML. The arms race between the attack and target systems in AMLC is modeled and analyzed by merging all of the models and taxonomies provided throughout the paper ( Table VI ).

A. Research Works Ranking

In AMLC, each cycle can be studied as a generation , where the arms race leads to the emergence of new types of target, attack, or defense systems due to the contributions of research works over the years ( Table III ). There are various ways of arranging these systems in a sequence. In this research, a heuristic metric, called research impact , is defined to rank the research works in adversarial machine learning:

Definition VI.1. Research Impact (RI):

Given an adversarial machine learning cycle, research impact is a heuristic metric that facilitates ranking the research works based on the differences in the specifications of their systems under study.

As seen in Table V , RIs of zero or one are assigned to each AMLC model based on its target, attack, and defense system specifications in the proposed taxonomies ( Figs. 2 , 4 , 5 , 6 , & 8 ). The RI of the defense strategy for defenseless ML applications is considered zero. For ease of analysis, the RI can be represented as bits. The target, attack, and defense model impacts are then combined as a bit string to represent a generation number for each research work. As seen in Table VI , the generation number is calculated by converting its binary representation to the corresponding integer value. In each generation with a greater number, progress is seen in at least one of the models. In the second step, the works are sorted in ascending order of the generation numbers ( Table VI ). The two vertical dots in some rows of Table VI show the gaps between the works, indicating open research problems. It is noteworthy that the choice of research impact for analyzing the arms race, the particular arrangement of the impacts in the bit string, and the way of calculating the generation numbers can be changed based on the analysis purposes. Hence, the generation number does not represent the quality of the works, but is merely an indicator for exploring the field. For example, Table VI gives higher priority to the RI of ML applications (left columns), then the adversaries (middle columns), and finally the defense systems (right columns). This can be useful for researchers pursuing the most recent system under the most successful attack strategy. The taxonomy can also suggest the most relevant adversarial model to a researcher who wants to analyze and improve a given ML application. In addition, vector operations, such as the dot product, can be applied to the bit strings to evaluate the similarity between competing research works and extract the trends in these works.

TABLE V:

Research impact of systems specifications in AMLC.

System | Specification | Zero Impact (○) | One Impact (●)
Target | Data | One-Dimensional (e.g. signal) | Multidimensional (e.g. image)
Target | Architecture | Classic ML | Deep Learning
Target | Accuracy (Acc.) | ≤ 50% | > 50%
Attack | Knowledge | Limited | Full
Attack | Capability | Limited | Full
Attack | Strategy | Reverse Engineering & Optimization | Game Theory
Attack | Goal | Indiscriminate | Targeted
Attack | Success Rate (SR) | ≤ 50% | > 50%
Defense | Strategy | Modified Classifier (MC) | Specialized Detector (SD)
Defense | Acc. after Defense | ≤ 50% | > 50%
Defense | Defense Gain | ≤ 50% | > 50%
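The following sketch shows how the research-impact bits of Table V compose into a generation number, with the target bits most significant, then the attack bits, then the defense bits (matching the priority described in Sec. VI-A); the sample bit values are illustrative, not taken from Table VI.

```python
# A minimal sketch of composing a generation number from research-impact bits.
target_bits = [1, 1, 1]            # data, architecture, accuracy
attack_bits = [0, 1, 0, 1, 1]      # knowledge, capability, strategy, goal, success rate
defense_bits = [0, 1, 1]           # strategy, acc. after defense, defense gain

bit_string = "".join(str(b) for b in target_bits + attack_bits + defense_bits)
generation_number = int(bit_string, 2)   # binary representation -> integer
print(bit_string, "->", generation_number)
```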

B. Usage Scenario

The arms race between the attack and target systems in Table VI indicates the focus of the state-of-the-art research. New research can also be added to the taxonomy based on its rank, calculated from the research impacts of its specifications.

1) Discovering Open Problems:

Ideally, it is expected that the ranks in Table VI match the chronological order of publication of each research work, which does not completely hold in practice. This mismatch reveals potential open problems in the field. For instance, the gap between the works by Demontis et al. [ 82 ] (generation #395) and Zhang et al. [ 73 ] (generation #815) indicates the need for testing classic machine learning algorithms applied to low dimensional data under more complex attacks (e.g. with unlimited capabilities and advanced attack strategies). Also, the gap between the works by Bulò et al. [ 79 ] (generation #1410) and Graese et al. [ 77 ] (generation #1862) shows the lack of work on testing advanced targeted attacks on classic ML algorithms. As another example, the work by Baluja et al. [ 86 ] (generation #2042) shows that there is still no effective defense strategy against advanced attacks using ATN on CNN-based applications ( Table III ).

2) Facilitating System Design:

The taxonomy in Table VI can also help ML designers choose an optimal configuration for the target application (e.g. the type of dataset, feature extraction algorithm, or ML algorithm) or the defense strategy. In this case, designers can find the most robust ML algorithm against a specific type of attack. For instance, the work by Zhang et al. [ 73 ] (generation #815) suggests adding an SVM-based detector to the ML application as an effective defense against a successful gray-box targeted attack on the DNN algorithm applied in Siri for analyzing audio data ( Table III ). As another example, the work by Bulò et al. [ 79 ] (generation #1410) shows that an SVM classifier with a linear kernel has high robustness against a white-box attack using the gradient descent strategy ( Table III ).

VII. Challenges and Research Opportunities

Based on the provided system models and taxonomies, the following challenges towards securing ML applications against perturbation attacks are extracted:

1) Undiscovered Problems & Research Gaps:

The emergence of a wide variety of ML algorithms and attack strategies requires extra effort in handling security issues in the CI field. Table VI shows the research gaps that need to be addressed by future research toward securing ML applications.

2) Lack of Standard Evaluation Metrics:

A set of standard metrics for the precise evaluation of ML algorithms under attack is required [ 269 ], [ 270 ]. Such metrics would ensure a certain amount of robustness for ML algorithms facing various attacks, which is vital for deployment in real-world applications. The proposed metric (research impact) in this work can be an initial step toward coining quantitative standard metrics. For example, the processing time for generating adversarial samples is just as important as the attack success rate, but it is often ignored in research.

3) Excessive Attention on Modifying ML Processes:

Most defenses focus on ML algorithms, while rethinking formal methods for extracting new data features to mitigate the attacks is usually ignored. For instance, considering the distance constraints on adversarial samples, finding features that maximize the gap between the source and target classes can lead to immunity against perturbation attacks. Classification based on new types of features may also defeat the attacks. For example, in facial recognition systems, extracting features inspired by human visual perception (e.g. simile [ 271 ]) can diminish vulnerability to pixel-level perturbations.

4) Classifier’s Accuracy and Defense Cost Trade-Off:

Defense methods typically degrade the ML accuracy. A possible future trend is to find an analytical solution for modeling the trade-off between the classification accuracy and the effectiveness of the defense strategies [ 192 ], [ 272 ].

5) Geometrical Interpretation in Feature Domain:

Some works quantitatively analyze the security of ML algorithms from a geometrical viewpoint [ 50 ], [ 56 ], [ 57 ], [ 273 ]. However, these studies are limited to low dimensional input data and are not applicable in higher dimensions (e.g. image data). By knowing the position of each class and the class boundaries within the data domain, the constraints on the amount of change in data samples can be set so as to keep a distance from the target class. Also, analyzing the volume of each class indicates the quality of the classifier in drawing class boundaries and eliminating blind spots. Although these types of geometrical analysis can assist ML designers in extracting new features or refining algorithms for robust classification, applying the analysis to high dimensional domains is still intractable.

6) Lack of Focus on Other Types of Classifiers:

One of the reasons for choosing supervised ML algorithms for analysis in this research was the significant amount of attention they receive; however, other types of algorithms, such as unsupervised and reinforcement learning , require extensive analysis as well [ 61 ], [ 274 ], [ 275 ]. The proposed approach in this work can also be used to analyze other types of ML algorithms.

7) Insufficient Studies on Practical Scenarios:

Few works go beyond theoretical analysis to test practical conditions [ 118 ], [ 276 ], [ 277 ]. For instance, in practical ML-based authentication systems, an adversary may apply different methods to launch a replay attack, such as coercion or using an artefact (e.g. a gummy finger) [ 186 ]. For mitigating coercion, stress or fear detectors (e.g. analyzing facial muscle movement [ 278 ]), and for detecting artefacts, liveness detection methods (e.g. perspiration in fingerprinting [ 279 ]), are some defensive strategies. Modeling the adversary's capability considering these breaches is a hot research topic.

8) Security Solutions Post-Attack:

Predicting all possible attacks during ML design is not feasible. System recovery after encountering attacks is also an important topic that needs to be addressed. Machine unlearning [ 267 ], as a recovery method that reduces the effects of a poisoning attack on a trained model, is an example of a post-attack solution.

9) Certified Defenses as Guaranteed Solutions:

Certified defenses, such as randomized smoothing [ 32 ], [ 34 ], differential privacy [ 31 ], or constrained adversarial training [ 33 ], guarantee a certain level of robustness against attacks that are constrained in how much they may alter the source data. Under such defenses, the adversary must make larger alterations, which increases the chance of being spotted by an external observer. However, in many ML applications, such as physiological-signal-based authentication systems [ 5 ], an observer is unlikely to detect the alteration; there is then effectively no constraint on generating adversarial samples, and certified defenses cannot block the attack.
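As a simplified illustration of the randomized-smoothing flavor of certification [ 32 ], [ 34 ], the sketch below classifies many Gaussian-noisy copies of an input and converts the empirical top-class frequency into an L2 radius within which the smoothed prediction cannot change. It assumes a scikit-learn-style `base_classifier` returning integer labels, and it uses the raw frequency rather than the confidence lower bound a proper certificate requires.

```python
import numpy as np
from scipy.stats import norm


def smoothed_prediction(base_classifier, x, sigma=0.25, n_samples=1000, seed=0):
    """Majority vote under Gaussian noise plus a (simplified) certified L2 radius."""
    rng = np.random.default_rng(seed)
    noisy = x[None, :] + rng.normal(0.0, sigma, size=(n_samples, x.size))
    votes = np.bincount(base_classifier.predict(noisy).astype(int))
    top_class = int(votes.argmax())
    p_top = min(votes[top_class] / n_samples, 1.0 - 1e-6)   # avoid an infinite radius

    # If the noisy majority is strong enough, no L2 perturbation smaller than this
    # radius can change the smoothed prediction (assuming p_top lower-bounds the true
    # top-class probability, which a real certificate must establish statistically).
    radius = sigma * norm.ppf(p_top) if p_top > 0.5 else 0.0
    return top_class, float(radius)
```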

VIII. Conclusion

This research analyzes perturbation attacks on ML applications by developing system-driven models and taxonomies of the applications, the adversary's strategies, the defense mechanisms, and their interactions. The provided models and taxonomies can be used to rank and compare research works and to reveal potential vulnerabilities that have not yet been addressed by current studies. The taxonomies are independent of any specific type of data, algorithm, or application, and can therefore be generalized to evaluate and organize future research. The main aims of creating the set of system-driven taxonomies are twofold: 1) enabling the extension of recent research and 2) discovering open problems and research gaps. This research outcome can benefit: 1) new researchers in the field, in choosing appropriate ML applications, attack strategies, or defense mechanisms to address; 2) experienced researchers, for relative comparison of contributions in recent and future research; 3) ML designers, in ensuring that the selected ML application has state-of-the-art defense strategies; and 4) standardization organizations [ 186 ], by utilizing the provided systematic terminologies, models, and taxonomies.

Acknowledgments

This work has been partly funded by CNS grant #1218505, IIS grant #1116385, and NIH grant #EB019202.

Biographies


Koosha Sadeghi is a Ph.D. candidate in Computer Engineering at the School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ. He is also working as a digital signal processing and machine learning algorithms engineer at Ciye.co, Albany, California. His current research interests focus on machine learning, mathematical optimization, and pervasive computing. More details about his research are available at http://kooshasadeghi.weebly.com .


Ayan Banerjee is an Assistant Research Professor in the School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ. He received his PhD degree in Computer Science from Arizona State University and his B.E. degree in Electronics and Telecommunication Engineering from Jadavpur University, Kolkata, India. His research interests include safety and sustainability of cyber-physical systems. He is a member of the IEEE. His publications are available at http://impact.asu.edu/ayan/index.html .


Sandeep K. S. Gupta is the Director of the School of Computing, Informatics, and Decision Systems Engineering (SCIDSE) and a Professor of Computer Science and Engineering, Arizona State University, Tempe, AZ. He is a member of several Graduate Faculties at ASU including Computer Science, Computer Engineering, Electrical Engineering, and Mechanical Engineering. He received the BTech degree in Computer Science and Engineering (CSE) from the Institute of Technology, Banaras Hindu University, Varanasi, India, the M.Tech. degree in CSE from the Indian Institute of Technology, Kanpur, and the MS and PhD degrees in Computer and Information Science from Ohio State University, Columbus, OH. He has served at Duke University, Durham, NC, as a postdoctoral researcher; at Ohio University, Athens, OH, as a Visiting Assistant Professor; and at Colorado State University, Ft. Collins, CO, as an Assistant Professor. His current research is focused on safe, secure and sustainable cyber-physical systems, with a focus on AI-enabled systems such as the Artificial Pancreas and Autonomous Transportation. His research has been funded by the US National Science Foundation (NSF), the National Institutes of Health (NIH), the Science Foundation of Arizona (SFAz), the Consortium for Embedded Systems (CES), the Intel Corp., Raytheon, Northrop Grumman, and Mediserve Information Systems. Dr. Gupta has published over 150 peer-reviewed conference and journal articles (Google Scholar h-index 54) and has advised over 15 PhD and over 25 MS students. He has co-authored two books: Fundamentals of Mobile and Pervasive Computing, McGraw Hill, and Body Sensor Networks: Safety, Security and Sustainability, Cambridge University Press. He serves or has served on the editorial boards of Elsevier Sustainable Computing, IEEE Transactions on Parallel & Distributed Systems, IEEE Communications Letters, and Wireless Networks. Dr. Gupta is a Senior Sustainability Scientist in the Global Institute of Sustainability, ASU. His awards include the 2009 SCIDSE Best Senior Researcher award, a Best Paper Award for security for pervasive health monitoring applications, and two best paper award nominations. His research has been highlighted in research news, sites, and blogs from various sources including NSF, ACM, ASU, and the Discovery Channel. He was TPC Chair of the BodyNets 2008 conference and TPC co-chair of the GreenCom 2013 conference, and has served as special issue co-editor for IEEE Pervasive Computing, IEEE Transactions on Computers, IEEE Transactions on Knowledge and Data Engineering, and Proceedings of the IEEE. Dr. Gupta heads the IMPACT Lab ( http://impact.asu.edu ) at Arizona State University.

Footnotes

1 In this paper, the term “ML application” refers to an ML-based CI application.

REFERENCES

[1] Tian Y, Pei K, Jana S, and Ray B, “ DeepTest: Automated testing of deep-neural-network-driven autonomous cars ,” arXiv preprint arXiv:1708.08559 , 2017.
[2] Bourzac K, “ Bringing big neural networks to self-driving cars, smart-phones, and drones ,” IEEE spectrum , 2016.
[3] Mnih V et al., “ Human-level control through deep reinforcement learning ,” Nature , vol. 518 , no. 7540 , pp. 529–533, 2015. [ PubMed ] [ Google Scholar ]
[4] Shin H-C et al., “ Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning ,” IEEE transactions on medical imaging , vol. 35 , no. 5 , pp. 1285–1298, 2016. [ PMC free article ] [ PubMed ] [ Google Scholar ]
[5] Sohankar J, Sadeghi K, Banerjee A, and Gupta SKS, “E-BIAS: A pervasive EEG-based identification and authentication system,” in Q2SWinet . ACM, 2015. [ Google Scholar ]
[6] Sharif M, Bhagavatula S, Bauer L, and Reiter MK, “ Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition ,” in Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016, pp. 1528–1540. [ Google Scholar ]
[7] Andor D et al., “ Globally normalized transition-based neural networks ,” arXiv preprint arXiv:1603.06042 , 2016.
[8] Najafabadi MM et al., “ Deep learning applications and challenges in big data analytics ,” Journal of Big Data , vol. 2 , no. 1 , p. 1, 2015. [ Google Scholar ]
[9] Heartfield R and Loukas G, “ A taxonomy of attacks and a survey of defence mechanisms for semantic social engineering attacks ,” ACM Computing Surveys (CSUR) , vol. 48 , no. 3 , p. 37, 2016. [ Google Scholar ]
[10] Lowd D and Meek C, “Adversarial learning,” in SIGKDD . ACM, 2005, pp. 641–647. [ Google Scholar ]
[11] Szegedy C et al., “ Intriguing properties of neural networks ,” arXiv preprint arXiv:1312.6199 , 2013.
[12] Goodfellow IJ, Shlens J, and Szegedy C, “ Explaining and harnessing adversarial examples ,” arXiv preprint arXiv:1412.6572 , 2014.
[13] Finlayson SG et al., “ Adversarial attacks on medical machine learning ,” Science , vol. 363 , no. 6433 , pp. 1287–1289, 2019. [ PMC free article ] [ PubMed ] [ Google Scholar ]
[14] McCarty B, “ The honeynet arms race ,” IEEE Security & Privacy , vol. 1 , no. 6 , pp. 79–82, 2003. [ Google Scholar ]
[15] Andrews M, “ Guest editor’s introduction: The state of web security ,” IEEE Security & Privacy , vol. 4 , no. 4 , pp. 14–15, 2006. [ Google Scholar ]
[16] Biggio B, Fumera G, and Roli F, “ Multiple classifier systems for robust classifier design in adversarial environments ,” Intl. Journal of Machine Learning and Cybernetics , vol. 1 , no. 1–4 , pp. 27–41, 2010. [ Google Scholar ]
[17] Biggio B et al., “Evasion attacks against machine learning at test time,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases . Springer, 2013, pp. 387–402. [ Google Scholar ]
[18] Zhang F, Chan PP, Biggio B, Yeung DS, and Roli F, “ Adversarial feature selection against evasion attacks ,” IEEE transactions on cybernetics , vol. 46 , no. 3 , pp. 766–777, 2015. [ PubMed ] [ Google Scholar ]
[19] Maiorca D, Biggio B, and Giacinto G, “ Towards adversarial malware detection: Lessons learned from pdf-based attacks ,” ACM Computing Surveys (CSUR) , vol. 52 , no. 4 , p. 78, 2019. [ Google Scholar ]
[20] Dalvi N, Domingos P, Sanghai S, Verma D et al., “ Adversarial classification ,” in SIGKDD , 2004, pp. 99–108.
[21] Kissel R, Glossary of key information security terms . Diane Publishing, 2011. [ Google Scholar ]
[22] Abadi M et al., “ On the protection of private information in machine learning systems: Two recent approaches ,” in Computer Security Foundations Symposium (CSF), 2017 IEEE 30th. IEEE, 2017, pp. 1–6. [ Google Scholar ]
[23] Nguyen A, Yosinski J, and Clune J, “ Deep neural networks are easily fooled: High confidence predictions for unrecognizable images ,” in CVPR , 2015, pp. 427–436.
[24] Alfeld S, Zhu X, and Barford P, “ Data poisoning attacks against autoregressive models ,” in AAAI , 2016, pp. 1452–1458.
[25] Muñoz-González L et al., “ Towards poisoning of deep learning algorithms with back-gradient optimization ,” in Proc. of the 10th ACM Workshop on Artificial Intelligence and Security, 2017, pp. 27–38. [ Google Scholar ]
[26] Dritsoula L, Loiseau P, and Musacchio J, “ A game-theoretic analysis of adversarial classification ,” IEEE Transactions on Information Forensics and Security , vol. 12 , no. 12 , pp. 3094–3109, 2017. [ Google Scholar ]
[27] Stein T, Chen E, and Mangla K, “ Facebook immune system ,” in Proc. of the 4th Workshop on Social Network Systems. ACM, 2011, p. 8. [ Google Scholar ]
[28] Cai H and Venkatasubramanian KK, “ Detecting signal injection attack-based morphological alterations of ECG measurements ,” in DCOSS, IEEE International Conference on, 2016, pp. 127–135. [ Google Scholar ]
[29] Ding D, Han Q-L, Xiang Y, Ge X, and Zhang X-M, “ A survey on security control and attack detection for industrial cyber-physical systems ,” Neurocomputing , vol. 275 , pp. 1674–1683, 2018. [ Google Scholar ]
[30] Tweneboah-Kodua S, Atsu F, and Buchanan W, “ Impact of cyber-attacks on stock performance: a comparative study ,” Information & Computer Security , vol. 26 , no. 5 , pp. 637–652, 2018. [ Google Scholar ]
[31] Lecuyer M, Atlidakis V, Geambasu R, Hsu D, and Jana S, “ Certified robustness to adversarial examples with differential privacy ,” arXiv preprint arXiv:1802.03471 , 2018.
[32] Liu X, Cheng M, Zhang H, and Hsieh C-J, “ Towards robust neural networks via random self-ensemble ,” in ECCV , 2018, pp. 369–385.
[33] Li B, Chen C, Wang W, and Duke LC, “ Certified adversarial robustness with additive Gaussian noise ,” IEEE Security & Privacy , 2019.
[34] Cohen JM, Rosenfeld E, and Kolter JZ, “ Certified adversarial robustness via randomized smoothing ,” arXiv preprint arXiv:1902.02918 , 2019.
[35] Rice D, “ The driverless car and the legal system: Hopes and fears as the courts, regulatory agencies, waymo, tesla, and uber deal with this exciting and terrifying new technology ,” Journal of Strategic Innovation and Sustainability , vol. 14 , no. 1 , 2019. [ Google Scholar ]
[36] Paradis C, Kazman R, and Davies MD, “ Towards explaining security defects in complex autonomous aerospace systems ,” in AIAA Scitech 2019 Forum, 2019, p. 0770. [ Google Scholar ]
[37] Carlini N, Athalye A, Papernot N, Brendel W, Rauber J, Tsipras D, Goodfellow I, and Madry A, “ On evaluating adversarial robustness ,” arXiv preprint arXiv:1902.06705 , 2019.
[38] Barreno M, Nelson B, Joseph AD, and Tygar JD, “ The security of machine learning ,” Machine Learning , vol. 81 , no. 2 , pp. 121–148, 2010. [ Google Scholar ]
[39] Gardiner J and Nagaraja S, “ On the security of machine learning in malware c&c detection: A survey ,” ACM Computing Surveys (CSUR) , vol. 49 , no. 3 , p. 59, 2016. [ Google Scholar ]
[40] Biggio B and Roli F, “ Wild patterns: Ten years after the rise of adversarial machine learning ,” arXiv preprint arXiv:1712.03141 , 2017.
[41] Liu Q et al., “ A survey on security threats and defensive techniques of machine learning: A data driven view ,” IEEE Access , 2018.
[42] Zhou Y, Kantarcioglu M, and Xi B, “ A survey of game theoretic approach for adversarial machine learning ,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , p. e1259, 2018.
[43] Akhtar N and Mian A, “ Threat of adversarial attacks on deep learning in computer vision: A survey ,” arXiv preprint arXiv:1801.00553 , 2018.
[44] Ozdag M, “ Adversarial attacks and defenses against deep neural networks: A survey ,” Procedia Computer Science , vol. 140 , pp. 152–161, 2018. [ Google Scholar ]
[45] Xiao Q, Li K, Zhang D, and Xu W, “ Security risks in deep learning implementations ,” in Security and Privacy Workshops. IEEE, 2018, pp. 123–128. [ Google Scholar ]
[46] Khalid F, Hanif MA, Rehman S, and Shafique M, “ Security for machine learning-based systems: Attacks and challenges during training and inference ,” in FIT. IEEE, 2018, pp. 327–332. [ Google Scholar ]
[47] Rouani BD, Samragh M, Javidi T, and Koushanfar F, “ Safe machine learning and defeating adversarial attacks ,” IEEE Security & Privacy , vol. 17 , no. 2 , pp. 31–38, 2019. [ Google Scholar ]
[48] Biggio B, Fumera G, and Roli F, “ Security evaluation of pattern classifiers under attack ,” IEEE Transactions on Knowledge and Data Engineering , vol. 26 , no. 4 , pp. 984–996, 2014. [ Google Scholar ]
[49] Barreno M, Nelson B, Sears R, Joseph AD, and Tygar JD, “ Can machine learning be secure? ” in Proc. of the 2006 ACM Symp. on Information, computer and communications security, 2006, pp. 16–25. [ Google Scholar ]
[50] Laskov P and Kloft M, “ A framework for quantitative security analysis of machine learning ,” in Proceedings of the 2nd ACM workshop on Security and artificial intelligence. ACM, 2009, pp. 1–4. [ Google Scholar ]
[51] Huang L, Joseph AD, Nelson B, Rubinstein BI, and Tygar J, “ Adversarial machine learning ,” in Proceedings of the 4th ACM workshop on Security and artificial intelligence. ACM, 2011, pp. 43–58. [ Google Scholar ]
[52] Kloft M and Laskov P, “ Security analysis of online centroid anomaly detection ,” Journal of Machine Learning Research , vol. 13 , no. Dec , pp. 3681–3724, 2012. [ Google Scholar ]
[53] Laskov P et al., “ Practical evasion of a learning-based classifier: A case study ,” in S&P . IEEE, 2014, pp. 197–211. [ Google Scholar ]
[54] Papernot N et al., “ The limitations of deep learning in adversarial settings ,” in EuroS&P . IEEE, 2016, pp. 372–387. [ Google Scholar ]
[55] Papernot N, McDaniel P, Sinha A, and Wellman MP, “ SoK: Security and privacy in machine learning ,” in 2018 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2018, pp. 399–414. [ Google Scholar ]
[56] Sadeghi K, Banerjee A, Sohankar J, and Gupta SKS, “ Toward parametric security analysis of machine learning based cyber forensic biometric systems ,” in ICMLA. IEEE, 2016, pp. 626–631. [ Google Scholar ]
[57] Sadeghi K, “ Geometrical analysis of machine learning security in biometric authentication systems ,” in ICMLA. IEEE, 2017, pp. 309–314. [ Google Scholar ]
[58] Li B and Vorobeychik Y, “ Evasion-robust classification on binary domains ,” ACM Transactions on Knowledge Discovery from Data (TKDD) , vol. 12 , no. 4 , p. 50, 2018. [ Google Scholar ]
[59] Frederickson C, Moore M, Dawson G, and Polikar R, “ Attack strength vs. detectability dilemma in adversarial machine learning ,” arXiv preprint arXiv:1802.07295 , 2018.
[60] Wang X, Li J, Kuang X, Tan Y.-a., and Li J, “ The security of machine learning in an adversarial setting: A survey ,” Journal of Parallel and Distributed Computing , 2019.
[61] Chen T, Liu J, Xiang Y, Niu W, Tong E, and Han Z, “ Adversarial attack and defense in reinforcement learning-from AI security view ,” Cybersecurity , vol. 2 , no. 1 , p. 11, 2019. [ Google Scholar ]
[62] Ling X et al., “ Deepsec: A uniform platform for security analysis of deep learning model ,” in IEEE S&P , 2019.
[63] Izmailov R, Sugrim S, Chadha R, McDaniel P, and Swami A, “Enablers of adversarial attacks in machine learning,” in MILCOM . IEEE, 2018, pp. 425–430. [ Google Scholar ]
[64] Nguyen L, Wang S, and Sinha A, “ A learning and masking approach to secure learning ,” in International Conference on Decision and Game Theory for Security. Springer, 2018, pp. 453–464. [ Google Scholar ]
[65] Sethi TS, Kantardzic M, Lyu L, and Chen J, “ A dynamic-adversarial mining approach to the security of machine learning ,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery , vol. 8 , no. 3 , p. e1245, 2018. [ Google Scholar ]
[66] Yu Y, Liu X, and Chen Z, “ Attacks and defenses towards machine learning based systems ,” in Proc. of the 2nd Intl. Conf. on Computer Science and Application Engineering. ACM, 2018, p. 175. [ Google Scholar ]
[67] Truong A, Kiyavash N, and Etesami SR, “ Adversarial machine learning: The case of recommendation systems ,” in 2018 IEEE 19th International Workshop on SPAWC, 2018, pp. 1–5. [ Google Scholar ]
[68] Yuan X, He P, Zhu Q, and Li X, “ Adversarial examples: Attacks and defenses for deep learning ,” IEEE transactions on neural networks and learning systems , 2019. [ PubMed ]
[69] Smutz C and Stavrou A, “ Malicious pdf detection using metadata and structural features ,” in Proceedings of the 28th annual computer security applications conference. ACM, 2012, pp. 239–248. [ Google Scholar ]
[70] Biggio B et al., “ One-and-a-half-class multiple classifier systems for secure learning against evasion attacks at test time ,” in Intl. Workshop on Multiple Classifier Systems. Springer, 2015, pp. 168–180. [ Google Scholar ]
[71] Zhang F, Chan PP, Biggio B, Yeung DS, and Roli F, “ Adversarial feature selection against evasion attacks ,” IEEE transactions on cybernetics , vol. 46 , no. 3 , pp. 766–777, 2016. [ PubMed ] [ Google Scholar ]
[72] Song C and Shmatikov V, “ Fooling OCR systems with adversarial text images ,” arXiv preprint arXiv:1802.05385 , 2018.
[73] Zhang G et al., “ Dolphinattack: Inaudible voice commands ,” in Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2017, pp. 103–117. [ Google Scholar ]
[74] Kurakin A, Goodfellow I, and Bengio S, “ Adversarial examples in the physical world ,” arXiv preprint arXiv:1607.02533 , 2016.
[75] Yang D, Xiao C, Li B, Deng J, and Liu M, “ Realistic adversarial examples in 3d meshes ,” arXiv preprint arXiv:1810.05206 , 2018.
[76] Huang S, Papernot N, Goodfellow I, Duan Y, and Abbeel P, “ Adversarial attacks on neural network policies ,” arXiv preprint arXiv:1702.02284 , 2017.
[77] Graese A, Roza A, and Boult TE, “ Assessing threat of adversarial examples on deep neural networks ,” in ICMLA. IEEE, 2016, pp. 69–74. [ Google Scholar ]
[78] Papernot N, McDaniel P, Wu X, Jha S, and Swami A, “ Distillation as a defense to adversarial perturbations against deep neural networks ,” in S&P. IEEE, 2016, pp. 582–597. [ Google Scholar ]
[79] Bulò SR, Biggio B, Pillai I, Pelillo M, and Roli F, “ Randomized prediction games for adversarial machine learning ,” IEEE transactions on neural networks and learning systems , vol. 28 , no. 11 , pp. 2466–78, 2017. [ PubMed ] [ Google Scholar ]
[80] Cao X and Gong NZ, “ Mitigating evasion attacks to deep neural networks via region-based classification ,” arXiv preprint arXiv:1709.05583 , 2017.
[81] Carlini N and Wagner D, “ Towards evaluating the robustness of neural networks ,” in S&P , 2017, pp. 39–57.
[82] Demontis A et al., “ Yes, machine learning can be more secure! a case study on android malware detection ,” IEEE Transactions on Dependable and Secure Computing , 2017.
[83] Osadchy M, Hernandez-Castro J, Gibson S, Dunkelman O, and Pérez-Cabo D, “ No bot expects the DeepCAPTCHA! introducing immutable adversarial examples, with applications to CAPTCHA generation ,” IEEE Transactions on Information Forensics and Security , vol. 12 , no. 11 , pp. 2640–2653, 2017. [ Google Scholar ]
[84] Papernot N et al., “ Practical black-box attacks against machine learning ,” in Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security. ACM, 2017, pp. 506–519. [ Google Scholar ]
[85] Xu W, Evans D, and Qi Y, “ Feature squeezing: Detecting adversarial examples in deep neural networks ,” arXiv preprint arXiv:1704.01155 , 2017.
[86] Baluja S and Fischer I, “ Learning to attack: Adversarial transformation networks ,” in Proceedings of AAAI-2018, 2018. [ Google Scholar ]
[87] Fisher III JW and Principe JC, “ A methodology for information theoretic feature extraction ,” in Neural Networks Proceedings, 1998. IEEE World Congress on Computational Intelligence. The International Joint Conference on, vol. 3, 1998, pp. 1712–16. [ Google Scholar ]
[88] Mitchell TM, Machine Learning . Burr Ridge, IL: McGraw Hill, 1997. [ Google Scholar ]
[89] Zhao W, Long J, Yin J, Cai Z, and Xia G, “ Sampling attack against active learning in adversarial environment ,” in Intl. Conf. on Modeling Decisions for Artificial Intelligence. Springer, 2012, pp. 222–233. [ Google Scholar ]
[90] Tong L, Li B, Hajaj C, and Vorobeychik Y, “ Feature conservation in adversarial classifier evasion: A case study ,” arXiv preprint arXiv:1708.08327 , 2017.
[91] Baluja S and Fischer I, “ Adversarial transformation networks: Learning to generate adversarial examples ,” arXiv preprint arXiv:1703.09387 , 2017.
[92] Papernot N, McDaniel P, and Goodfellow I, “ Transferability in machine learning: from phenomena to black-box attacks using adversarial samples ,” arXiv preprint arXiv:1605.07277 , 2016.
[93] Shokri R, Stronati M, Song C, and Shmatikov V, “ Membership inference attacks against machine learning models ,” in S&P. IEEE, 2017, pp. 3–18. [ Google Scholar ]
[94] Rosenberg I, Shabtai A, Rokach L, and Elovici Y, “ Generic black-box end-to-end attack against state of the art API call based malware classifiers ,” in International Symposium on Research in Attacks, Intrusions, and Defenses. Springer, 2018, pp. 490–510. [ Google Scholar ]
[95] Zhou Y, Kantarcioglu M, Thuraisingham B, and Xi B, “ Adversarial support vector machine learning ,” in Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2012, pp. 1059–1067. [ Google Scholar ]
[96] Russu P, Demontis A, Biggio B, Fumera G, and Roli F, “ Secure kernel machines against evasion attacks ,” in Proceedings of the Workshop on Artificial Intelligence and Security. ACM, 2016, pp. 59–69. [ Google Scholar ]
[97] Sadeghi K, Sohankar J, Banerjee A, and Gupta SK, “ A novel spoofing attack against electroencephalogram-based security systems ,” in 2017 IEEE SmartWorld. IEEE, 2017, pp. 1–6. [ Google Scholar ]
[98] Hinton G, Vinyals O, and Dean J, “ Distilling the knowledge in a neural network ,” arXiv preprint arXiv:1503.02531 , 2015.
[99] Saini U, “ Machine learning in the presence of an adversary: Attacking and defending the spambayes spam filter ,” California Univ. Berkeley Dept. of Electrical Engineering and Computer Science, Tech. Rep ., 2008.
[100] Eykholt K et al., “ Robust physical-world attacks on deep learning visual classification ,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 1625–1634. [ Google Scholar ]
[101] Ilyas A, Engstrom L, Athalye A, and Lin J, “ Black-box adversarial attacks with limited queries and information ,” arXiv preprint arXiv:1804.08598 , 2018.
[102] Newsome J, Karp B, and Song D, “ Paragraph: Thwarting signature learning by training maliciously ,” in International Workshop on Recent Advances in Intrusion Detection . Springer, 2006, pp. 81–105. [ Google Scholar ]
[103] Xu W, Qi Y, and Evans D, “ Automatically evading classifiers ,” in Proc. of the 2016 Network and Distributed Systems Symp., 2016. [ Google Scholar ]
[104] Hayes J and Danezis G, “ Learning universal adversarial perturbations with generative models ,” in Security and Privacy Workshops. IEEE, 2018, pp. 43–49. [ Google Scholar ]
[105] Bhagoji AN, He W, Li B, and Song D, “ Exploring the space of black-box attacks on deep neural networks ,” arXiv preprint arXiv:1712.09491 , 2017.
[106] Bhagoji AN, “ Practical black-box attacks on deep neural networks using efficient query mechanisms ,” in European Conference on Computer Vision. Springer, 2018, pp. 158–174. [ Google Scholar ]
[107] Spall JC, Introduction to stochastic search and optimization: estimation, simulation, and control . John Wiley & Sons, 2005, vol. 65 . [ Google Scholar ]
[108] Chen P-Y, Zhang H, Sharma Y, Yi J, and Hsieh C-J, “ Zoo: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models ,” in Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, 2017, pp. 15–26. [ Google Scholar ]
[109] Narodytska N and Kasiviswanathan S, “ Simple black-box adversarial attacks on deep neural networks ,” in CVPRW. IEEE, 2017, pp. 1310–1318. [ Google Scholar ]
[110] Maiorca D, Ariu D, Corona I, Aresu M, and Giacinto G, “ Stealth attacks: An extended insight into the obfuscation effects on android malware ,” Computers & Security , vol. 51 , pp. 16–31, 2015. [ Google Scholar ]
[111] Gomez-Barrero M, Galbally J, Fierrez J, and Ortega-Garcia J, “ Face verification put to test: A hill-climbing attack based on the uphill-simplex algorithm ,” in Biometrics (ICB), 2012 5th IAPR International Conference on. IEEE, 2012, pp. 40–45. [ Google Scholar ]
[112] Maiorana E, Hine GE, La Rocca D, and Campisi P, “ On the vulnerability of an EEG-based biometric system to hill-climbing attacks algorithms’ comparison and possible countermeasures ,” in BTAS, 2013 IEEE Sixth International Conference on. IEEE, 2013, pp. 1–6. [ Google Scholar ]
[113] Dang H, Huang Y, and Chang E-C, “ Evading classifiers by morphing in the dark ,” in Proceedings of the ACM SIGSAC Conference on Computer and Communications Security, 2017, pp. 119–133. [ Google Scholar ]
[114] Gao J, Lanchantin J, Soffa ML, and Qi Y, “ Black-box generation of adversarial text sequences to evade deep learning classifiers ,” in Security and Privacy Workshops. IEEE, 2018, pp. 50–56. [ Google Scholar ]
[115] Kos J, Fischer I, and Song D, “ Adversarial examples for generative models ,” in Security and Privacy Workshops. IEEE, 2018, pp. 36–42. [ Google Scholar ]
[116] Moosavi-Dezfooli S-M, Fawzi A, and Frossard P, “ Deepfool: a simple and accurate method to fool deep neural networks ,” in CVPR. IEEE, 2016, pp. 2574–2582. [ Google Scholar ]
[117] Chen P-Y, Sharma Y, Zhang H, Yi J, and Hsieh C-J, “ EAD: elastic-net attacks to deep neural networks via adversarial examples ,” in Thirty-second AAAI conference on artificial intelligence, 2018. [ Google Scholar ]
[118] Evtimov I et al., “ Robust physical-world attacks on machine learning models ,” arXiv preprint arXiv:1707.08945 , 2017.
[119] Kolosnjaji B et al., “ Adversarial malware binaries: Evading deep learning for malware detection in executables ,” arXiv preprint arXiv:1803.04173 , 2018.
[120] Xie C et al., “ Adversarial examples for semantic segmentation and object detection ,” arXiv preprint arXiv:1703.08603 , 2017.
[121] Papernot N, McDaniel P, Swami A, and Harang R, “ Crafting adversarial input sequences for recurrent neural networks ,” in MILCOM. IEEE, 2016, pp. 49–54. [ Google Scholar ]
[122] Simonyan K, Vedaldi A, and Zisserman A, “ Deep inside convolutional networks: Visualising image classification models and saliency maps ,” arXiv preprint arXiv:1312.6034 , 2013.
[123] Carlini N and Wagner D, “ Defensive distillation is not robust to adversarial examples ,” arXiv preprint arXiv:1607.04311 , 2016.
[124] Evtimov I et al., “ Robust physical-world attacks on deep learning models ,” in Computer Vision and Pattern Recognition , 2018.
[125] Song D et al., “ Physical adversarial examples for object detectors ,” in 12th USENIX Workshop on Offensive Technologies (WOOT 18), 2018. [ Google Scholar ]
[126] Grosse K, Papernot N, Manoharan P, Backes M, and McDaniel P, “ Adversarial examples for malware detection ,” in European Symposium on Research in Computer Security. Springer, 2017, pp. 62–79. [ Google Scholar ]
[127] Güera D et al., “ A counter-forensic method for CNN-based camera model identification ,” in CVPRW. IEEE, 2017, pp. 1840–1847. [ Google Scholar ]
[128] Kurakin A, Goodfellow I, and Bengio S, “ Adversarial machine learning at scale ,” arXiv preprint arXiv:1611.01236 , 2016.
[129] Madry A, Makelov A, Schmidt L, Tsipras D, and Vladu A, “ Towards deep learning models resistant to adversarial attacks ,” arXiv preprint arXiv:1706.06083 , 2017.
[130] Tramèr F, Kurakin A, Papernot N, Boneh D, and McDaniel P, “ Ensemble adversarial training: Attacks and defenses ,” arXiv preprint arXiv:1705.07204 , 2017.
[131] Dong Y et al., “ Boosting adversarial attacks with momentum ,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 9185–9193. [ Google Scholar ]
[132] Sutskever I, Martens J, Dahl G, and Hinton G, “ On the importance of initialization and momentum in deep learning ,” in International conference on machine learning, 2013, pp. 1139–1147. [ Google Scholar ]
[133] Moosavi-Dezfooli S-M, Fawzi A, Fawzi O, and Frossard P, “ Universal adversarial perturbations ,” in CVPR , 2017, pp. 1765–1773.
[134] Yu J, Vishwanathan S, Günter S, and Schraudolph NN, “ A quasi-newton approach to nonsmooth convex optimization problems in machine learning ,” Journal of Machine Learning Research , vol. 11 , no. Mar , pp. 1145–1200, 2010. [ Google Scholar ]
[135] Dai Y-H, “ A perfect example for the BFGS method ,” Mathematical Programming , pp. 1–30, 2013.
[136] Byrd RH, Lu P, Nocedal J, and Zhu C, “ A limited memory algorithm for bound constrained optimization ,” SIAM Journal on Scientific Computing , vol. 16 , no. 5 , pp. 1190–1208, 1995. [ Google Scholar ]
[137] Sabour S, Cao Y, Faghri F, and Fleet DJ, “ Adversarial manipulation of deep representations ,” arXiv preprint arXiv:1511.05122 , 2015.
[138] Tabacof P and Valle E, “ Exploring the space of adversarial images ,” in IJCNN. IEEE, 2016, pp. 426–433. [ Google Scholar ]
[139] Xiao C et al., “ Spatially transformed adversarial examples ,” arXiv preprint arXiv:1801.02612 , 2018.
[140] Wong E, Schmidt FR, and Kolter JZ, “ Wasserstein adversarial examples via projected sinkhorn iterations ,” arXiv preprint arXiv:1902.07906 , 2019.
[141] Boyd S and Vandenberghe L, Convex optimization . Cambridge university press, 2004. [ Google Scholar ]
[142] Goodfellow I et al., “ Generative adversarial nets ,” in Advances in neural information processing systems , 2014, pp. 2672–2680.
[143] Goodfellow IJ, “ On distinguishability criteria for estimating generative models ,” arXiv preprint arXiv:1412.6515 , 2014.
[144] Radford A, Metz L, and Chintala S, “ Unsupervised representation learning with deep convolutional generative adversarial networks ,” arXiv preprint arXiv:1511.06434 , 2015.
[145] Xiao C et al., “ Generating adversarial examples with adversarial networks ,” arXiv preprint arXiv:1801.02610 , 2018.
[146] Oh SJ, Fritz M, and Schiele B, “ Adversarial image perturbation for privacy protection a game theory perspective ,” in ICCV. IEEE, 2017, pp. 1491–1500. [ Google Scholar ]
[147] Chivukula AS and Liu W, “ Adversarial learning games with deep learning models ,” in IJCNN. IEEE, 2017, pp. 2758–2767. [ Google Scholar ]
[148] Cun L et al., “ Comparison of learning algorithms for handwritten digit recognition ,” in Proc 1st Int Conf on Artificial Neural Networks, Sofia, Bulgaria, 1995, pp. 53–60. [ Google Scholar ]
[149] Goodfellow IJ, Bulatov Y, Ibarz J, Arnoud S, and Shet V, “ Multi-digit number recognition from street view imagery using deep convolutional neural networks ,” arXiv preprint arXiv:1312.6082 , 2013.
[150] http://contagiodump.blogspot.it , accessed: 2019-05-29.
[151] Rozsa A, Rudd EM, and Boult TE, “ Adversarial diversity and hard positive generation ,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016, pp. 25–32. [ Google Scholar ]
[152] Krizhevsky A and Hinton G, “ Learning multiple layers of features from tiny images ,” Technical report, University of Toronto , 2009.
[153] Chang C-C and Lin C-J, “ LIBSVM: a library for support vector machines ,” ACM transactions on intelligent systems and technology (TIST) , vol. 2 , no. 3 , p. 27, 2011. [ Google Scholar ]
[154] Cormack GV, “ TREC 2007 spam track overview ,” in The Sixteenth Text REtrieval Conference (TREC 2007) Proceedings, 2007. [ Google Scholar ]
[155] https://goo.gl/mEX7By , accessed: 2019-05-29.
[156] Deng J et al., “ Imagenet: A large-scale hierarchical image database ,” in CVPR. IEEE, 2009, pp. 248–255. [ Google Scholar ]
[157] Szegedy C, Vanhoucke V, Ioffe S, Shlens J, and Wojna Z, “ Rethinking the inception architecture for computer vision ,” in CVPR , 2016, pp. 2818–2826.
[158] Arp D et al., “ Drebin: Effective and explainable detection of android malware in your pocket ,” in NDSS , vol. 14 , 2014, pp. 23–26. [ Google Scholar ]
[159] Kołcz A and Teo CH, “ Feature weighting for improved classifier robustness ,” in CEAS’09: 6th conf. on email and anti-spam, 2009. [ Google Scholar ]
[160] Russakovsky O et al., “ ImageNet Large Scale Visual Recognition Challenge ,” International Journal of Computer Vision (IJCV) , vol. 115 , no. 3 , pp. 211–252, 2015. [ Google Scholar ]
[161] Chatfield K, Simonyan K, Vedaldi A, and Zisserman A, “ Return of the devil in the details: Delving deep into convolutional nets ,” arXiv preprint arXiv:1405.3531 , 2014.
[162] Zeiler MD and Fergus R, “ Visualizing and understanding convolutional networks ,” in European conference on computer vision. Springer, 2014, pp. 818–833. [ Google Scholar ]
[163] Sermanet P et al., “ Overfeat: Integrated recognition, localization and detection using convolutional networks ,” arXiv preprint arXiv:1312.6229 , 2013.
[164] Krizhevsky A, Sutskever I, and Hinton GE, “ Imagenet classification with deep convolutional neural networks ,” in Advances in neural information processing systems , 2012, pp. 1097–1105. [ Google Scholar ]
[165] http://www.metamind.io/ , accessed: 2018-01-02.
[166] Stallkamp J, Schlipsing M, Salmen J, and Igel C, “ Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition ,” Neural networks , vol. 32 , pp. 323–332, 2012. [ PubMed ] [ Google Scholar ]
[167] http://aws.amazon.com/machine-learning , accessed: 2019-05-29.
[168] http://cloud.google.com/prediction/ , accessed: 2018-01-02.
[169] Papernot N, Goodfellow I, Sheatsley R, Feinman R, and McDaniel P, “ cleverhans v1.0.0: an adversarial machine learning library ,” arXiv preprint arXiv:1610.00768 , 2016.
[170] Huang G, Liu Z, Weinberger KQ, and van der Maaten L, “ Densely connected convolutional networks ,” in Proc. of the IEEE conference on computer vision and pattern recognition , vol. 1 , no. 2 , 2017, p. 3. [ Google Scholar ]
[171] Howard AG et al., “ Mobilenets: Efficient convolutional neural networks for mobile vision applications ,” arXiv preprint arXiv:1704.04861 , 2017.
[172] Capes T et al., “ Siri on-device deep learning-guided unit selection text-to-speech system ,” Proc. Interspeech 2017, pp. 4011–4015, 2017. [ Google Scholar ]
[173] Szegedy C, Ioffe S, Vanhoucke V, and Alemi AA, “ Inception-v4, Inception-ResNet and the impact of residual connections on learning ,” in AAAI , 2017, pp. 4278–4284.
[174] Carlini N et al., “ Hidden voice commands ,” in USENIX Security Symposium, 2016, pp. 513–530. [ Google Scholar ]
[175] Ateniese G et al., “ Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers ,” International Journal of Security and Networks , vol. 10 , no. 3 , pp. 137–150, 2015. [ Google Scholar ]
[176] Tramèr F, Zhang F, Juels A, Reiter MK, and Ristenpart T, “ Stealing machine learning models via prediction APIs ,” in USENIX Security Symposium, 2016, pp. 601–618. [ Google Scholar ]
[177] Fredrikson M et al., “ Privacy in pharmacogenetics: An end-to-end case study of personalized warfarin dosing ,” in USENIX Security Symposium, 2014, pp. 17–32. [ PMC free article ] [ PubMed ] [ Google Scholar ]
[178] Biggio B, Didaci L, Fumera G, and Roli F, “ Poisoning attacks to compromise face templates ,” in Biometrics (ICB), 2013 International Conference on. IEEE, 2013, pp. 1–7. [ Google Scholar ]
[179] Jia Q, Guo L, Jin Z, and Fang Y, “ Preserving model privacy for machine learning in distributed systems ,” IEEE Transactions on Parallel and Distributed Systems , 2018.
[180] Alabdulmohsin IM, Gao X, and Zhang X, “ Adding robustness to support vector machines against adversarial reverse engineering ,” in Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management, 2014, pp. 231–240. [ Google Scholar ]
[181] Shi Y, Sagduyu Y, and Grushin A, “ How to steal a machine learning classifier with deep learning ,” in Technologies for Homeland Security (HST), 2017 IEEE International Symposium on. IEEE, 2017, pp. 1–5. [ Google Scholar ]
[182] Sakkis G et al., “ A memory-based approach to anti-spam filtering for mailing lists ,” Information retrieval , vol. 6 , no. 1 , pp. 49–73, 2003. [ Google Scholar ]
[183] Jain AK, Ross A, and Pankanti S, “ Biometrics: a tool for information security ,” Information Forensics and Security, IEEE Transactions on , vol. 1 , no. 2 , pp. 125–143, 2006. [ Google Scholar ]
[184] Kune DF et al., “ Ghost talk: Mitigating EMI signal injection attacks against analog sensors ,” in S&P. IEEE, 2013, pp. 145–159. [ Google Scholar ]
[185] Eberz S et al., “ Broken hearted: How to attack ECG biometrics ,” Proc. of the 2017 Network and Distributed Systems Symposium, 2017. [ Google Scholar ]
[186] Information technology - Biometric presentation attack detection - Part 1: Framework, ISO/IEC 30107–1 . International Organization for Standardization (ISO), 2016. [ Google Scholar ]
[187] Gui Q, Yang W, Jin Z, Ruiz-Blondet MV, and Laszlo S, “ A residual feature-based replay attack detection approach for brainprint biometric systems ,” in Information Forensics and Security (WIFS), 2016 IEEE International Workshop on. IEEE, 2016, pp. 1–6. [ Google Scholar ]
[188] Wu Z and Li H, “ Voice conversion and spoofing attack on speaker verification systems ,” in Signal and Information Processing Association Annual Summit and Conf., 2013 Asia-Pacific. IEEE, 2013, pp. 1–9. [ Google Scholar ]
[189] Grosse K, Papernot N, Manoharan P, Backes M, and McDaniel P, “ Adversarial perturbations against deep neural networks for malware classification ,” arXiv preprint arXiv:1606.04435 , 2016.
[190] Carlini N and Wagner D, “ Audio adversarial examples: Targeted attacks on speech-to-text ,” arXiv preprint arXiv:1801.01944 , 2018.
[191] Chen L, Ye Y, and Bourlai T, “ Adversarial machine learning in malware detection: Arms race between evasion attack and defense ,” in EISIC. IEEE, 2017, pp. 99–106. [ Google Scholar ]
[192] Sadeghi K, Banerjee A, Sohankar J, and Gupta SKS, “ Performance and security strength trade-off in machine learning based biometric authentication systems ,” in ICMLA. IEEE, 2017, pp. 1045–1048. [ Google Scholar ]
[193] Tong L, Yu S, Alfeld S, and Vorobeychik Y, “ Adversarial regression with multiple learners ,” arXiv preprint arXiv:1806.02256 , 2018.
[194] Yu S, Vorobeychik Y, and Alfeld S, “ Adversarial classification on social networks ,” arXiv preprint arXiv:1801.08159 , 2018.
[195] Brückner M and Scheffer T, “ Stackelberg games for adversarial prediction problems ,” in SIGKDD , 2011, pp. 547–555.
[196] Bruckner M, Kanzow C, and Scheffer T, “ Static prediction games for adversarial learning problems ,” Journal of Machine Learning Research , vol. 13 , no. Sep , pp. 2617–2654, 2012. [ Google Scholar ]
[197] Scutari G, Palomar DP, Facchinei F, and Pang J.-s., “ Convex optimization, game theory, and variational inequality theory ,” IEEE Signal Processing Magazine , vol. 27 , no. 3 , pp. 35–49, 2010. [ Google Scholar ]
[198] Dritsoula L, Loiseau P, and Musacchio J, “ A game-theoretic analysis of adversarial classification ,” Information Forensics and Security, IEEE Transactions on , vol. 12 , no. 12 , pp. 3094–3109, 2017. [ Google Scholar ]
[199] Athalye A and Sutskever I, “ Synthesizing robust adversarial examples ,” arXiv preprint arXiv:1707.07397 , 2017.
[200] Das N et al., “ Keeping the bad guys out: Protecting and vaccinating deep learning with jpeg compression ,” arXiv preprint arXiv:1705.02900 , 2017.
[201] Bhagoji AN, Cullina D, Sitawarin C, and Mittal P, “ Enhancing robustness of machine learning systems via data transformations ,” in CISS. IEEE, 2018, pp. 1–5. [ Google Scholar ]
[202] Lu J, Sibai H, Fabry E, and Forsyth D, “ No need to worry about adversarial examples in object detection in autonomous vehicles ,” arXiv preprint arXiv:1707.03501 , 2017.
[203] Bhagoji AN, Cullina D, and Mittal P, “ Dimensionality reduction as a defense against evasion attacks on machine learning classifiers ,” arXiv preprint arXiv:1704.02654 , 2017.
[204] Lu J, Sibai H, Fabry E, and Forsyth D, “ Standard detectors aren’t (currently) fooled by physical adversarial stop signs ,” arXiv preprint arXiv:1710.03337 , 2017.
[205] Guo C, Rana M, Cisse M, and van der Maaten L, “ Countering adversarial images using input transformations ,” arXiv preprint arXiv:1711.00117 , 2017.
[206] Xie C, Wang J, Zhang Z, Ren Z, and Yuille A, “ Mitigating adversarial effects through randomization ,” arXiv preprint arXiv:1711.01991 , 2017.
[207] Meng D and Chen H, “ MagNet: a two-pronged defense against adversarial examples ,” arXiv preprint arXiv:1705.09064 , 2017.
[208] Song Y, Kim T, Nowozin S, Ermon S, and Kushman N, “ Pixeldefend: Leveraging generative models to understand and defend against adversarial examples ,” arXiv preprint arXiv:1710.10766 , 2017.
[209] Huang R, Xu B, Schuurmans D, and Szepesvári C, “ Learning with a strong adversary ,” arXiv preprint arXiv:1511.03034 , 2015.
[210] Kantchelian A, Tygar J, and Joseph A, “ Evasion and hardening of tree ensemble classifiers ,” in International Conference on Machine Learning, 2016, pp. 2387–2396. [ Google Scholar ]
[211] Carlini N, Katz G, Barrett C, and Dill DL, “ Provably minimally-distorted adversarial examples ,” arXiv preprint arXiv:1709.10207 , 2017.
[212] Hosseini H, Chen Y, Kannan S, Zhang B, and Poovendran R, “ Blocking transferability of adversarial examples in black-box learning systems ,” arXiv preprint arXiv:1703.04318 , 2017.
[213] Al-Dujaili A, Huang A, Hemberg E, and O’Reilly U-M, “ Adversarial deep learning for robust detection of binary encoded malware ,” in Security and Privacy Workshops. IEEE, 2018, pp. 76–82. [ Google Scholar ]
[214] Lee H, Han S, and Lee J, “ Generative adversarial trainer: Defense to adversarial perturbations with GAN ,” arXiv preprint arXiv:1705.03387 , 2017.
[215] Bai W, Quan C, and Luo Z, “ Alleviating adversarial attacks via convolutional autoencoder ,” in SNPD, 2017 18th IEEE/ACIS International Conference on. IEEE, 2017, pp. 53–58. [ Google Scholar ]
[216] Rozsa A, Gunther M, and Boult TE, “ Towards robust deep neural networks with bang ,” arXiv preprint arXiv:1612.00138 , 2016.
[217] Tramèr F, Papernot N, Goodfellow I, Boneh D, and McDaniel P, “ The space of transferable adversarial examples ,” arXiv preprint arXiv:1704.03453 , 2017.
[218] Cai Q-Z, Du M, Liu C, and Song D, “ Curriculum adversarial training ,” arXiv preprint arXiv:1805.04807 , 2018.
[219] Wong E and Kolter JZ, “ Provable defenses against adversarial examples via the convex outer adversarial polytope ,” arXiv preprint arXiv:1711.00851 , 2017.
[220] Wong E, Schmidt F, Metzen JH, and Kolter JZ, “ Scaling provable adversarial defenses ,” in Advances in Neural Information Processing Systems , 2018, pp. 8400–8409.
[221] Raghunathan A, Steinhardt J, and Liang PS, “ Semidefinite relaxations for certifying robustness to adversarial examples ,” in Advances in Neural Information Processing Systems , 2018, pp. 10877–10887.
[222] Gao J, Wang B, Lin Z, Xu W, and Qi Y, “ DeepCloak: Masking deep neural network models for robustness against adversarial samples ,” ICLR Workshop track, 2017. [ Google Scholar ]
[223] Wang Q et al., “ Adversary resistant deep neural networks with an application to malware detection ,” in SIGKDD. ACM, 2017, pp. 1145–1153. [ Google Scholar ]
[224] Luo Y, Boix X, Roig G, Poggio T, and Zhao Q, “ Foveation-based mechanisms alleviate adversarial examples ,” arXiv preprint arXiv:1511.06292 , 2015.
[225] Liu Q et al., “ Security analysis and enhancement of model compressed deep learning systems under adversarial attacks ,” in Proceedings of the 23rd Asia and South Pacific Design Automation Conference. IEEE Press, 2018, pp. 721–726. [ Google Scholar ]
[226] Liao F et al., “ Defense against adversarial attacks using high-level representation guided denoiser ,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 1778–1787. [ Google Scholar ]
[227] Ross AS and Doshi-Velez F, “ Improving the adversarial robustness and interpretability of deep neural networks by regularizing their input gradients ,” in Thirty-second AAAI conf. on artificial intelligence, 2018. [ Google Scholar ]
[228] Drucker H and Le Cun Y, “ Improving generalization performance using double backpropagation ,” IEEE Transactions on Neural Networks , vol. 3 , no. 6 , pp. 991–997, 1992. [ PubMed ] [ Google Scholar ]
[229] Zheng S, Song Y, Leung T, and Goodfellow I, “ Improving the robustness of deep neural networks via stability training ,” in CVPR. IEEE, 2016, pp. 4480–88. [ Google Scholar ]
[230] Rakin AS, Yi J, Gong B, and Fan D, “ Defend deep neural networks against adversarial examples via fixed and dynamic quantized activation functions ,” arXiv preprint arXiv:1807.06714 , 2018.
[231] Kantchelian A et al., “ Large-margin convex polytope machine ,” in Advances in Neural Information Processing Systems, 2014, pp. 3248–3256. [ Google Scholar ]
[232] Kardan N and Stanley KO, “ Mitigating fooling with competitive overcomplete output layer neural networks ,” in Neural Networks, 2017 Intl. Joint Conference on. IEEE, 2017, pp. 518–525. [ Google Scholar ]
[233] Teo CH, Globerson A, Roweis ST, and Smola AJ, “ Convex learning with invariances ,” in Advances in neural information processing systems, 2008, pp. 1489–1496. [ Google Scholar ]
[234] Gu S and Rigazio L, “ Towards deep neural network architectures robust to adversarial examples ,” arXiv preprint arXiv:1412.5068 , 2014.
[235] Demontis A, Biggio B, Fumera G, Giacinto G, and Roli F, “ Infinity-norm support vector machines against adversarial label contamination ,” in ITASEC17, 2017, pp. 106–115. [ Google Scholar ]
[236] Globerson A and Roweis S, “ Nightmare at test time: robust learning by feature deletion ,” in Proceedings of the 23rd international conference on Machine learning. ACM, 2006, pp. 353–360. [ Google Scholar ]
[237] Liew SS, Khalil-Hani M, and Bakhteri R, “ Bounded activation functions for enhanced training stability of deep neural networks on visual pattern recognition problems ,” Neurocomputing , vol. 216 , pp. 718–734, 2016. [ Google Scholar ]
[238] Chan PP, Lin Z, Hu X, Tsang EC, and Yeung DS, “ Sensitivity based robust learning for stacked autoencoder against evasion attack ,” Neurocomputing , vol. 267 , pp. 572–580, 2017. [ Google Scholar ]
[239] Zantedeschi V, Nicolae M-I, and Rawat A, “ Efficient defenses against adversarial attacks ,” in Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. ACM, 2017, pp. 39–49. [ Google Scholar ]
[240] Fawzi A, Fawzi O, and Frossard P, “ Analysis of classifiers’ robustness to adversarial perturbations ,” arXiv preprint arXiv:1502.02590 , 2015.
[241] Hein M and Andriushchenko M, “ Formal guarantees on the robustness of a classifier against adversarial manipulation ,” in Advances in Neural Information Processing Systems, 2017, pp. 2266–2276. [ Google Scholar ]
[242] Cisse M, Bojanowski P, Grave E, Dauphin Y, and Usunier N, “ Parseval networks: Improving robustness to adversarial examples ,” in International Conference on Machine Learning, 2017, pp. 854–863. [ Google Scholar ]
[243] Dhillon GS et al., “ Stochastic activation pruning for robust adversarial defense ,” arXiv preprint arXiv:1803.01442 , 2018.
[244] Srisakaokul S, Zhong Z, Zhang Y, Yang W, and Xie T, “ Muldef: Multi-model-based defense against adversarial examples for neural networks ,” arXiv preprint arXiv:1809.00065 , 2018.
[245] Papernot N and McDaniel P, “ Deep k-nearest neighbors: Towards confident, interpretable and robust deep learning ,” arXiv preprint arXiv:1803.04765 , 2018.
[246] Grosse K, Manoharan P, Papernot N, Backes M, and McDaniel P, “ On the (statistical) detection of adversarial examples ,” arXiv preprint arXiv:1702.06280 , 2017.
[247] Lu J, Issaranon T, and Forsyth D, “ Safetynet: Detecting and rejecting adversarial examples robustly ,” CoRR, abs/1704.00103 , 2017.
[248] Gebhart T and Schrater P, “ Adversary detection in neural networks via persistent homology ,” arXiv preprint arXiv:1711.10056 , 2017.
[249] Yang Z, Li B, Chen P-Y, and Song D, “ Characterizing audio adversarial examples using temporal dependency ,” arXiv preprint arXiv:1809.10875 , 2018.
[250] Ma X et al., “ Characterizing adversarial subspaces using local intrinsic dimensionality ,” arXiv preprint arXiv:1801.02613 , 2018.
[251] Xiao C et al., “ Characterizing adversarial examples based on spatial consistency information for semantic segmentation ,” in Proceedings of the European Conference on Computer Vision, 2018, pp. 217–234. [ Google Scholar ]
[252] Li X and Li F, “ Adversarial examples detection in deep networks with convolutional filter statistics ,” CoRR, abs/1612.07767 , vol. 7 , 2016. [ Google Scholar ]
[253] Carrara F et al., “ Detecting adversarial example attacks to deep neural networks ,” in Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing. ACM, 2017, p. 38. [ Google Scholar ]
[254] Feinman R, Curtin RR, Shintre S, and Gardner AB, “ Detecting adversarial samples from artifacts ,” arXiv preprint arXiv:1703.00410 , 2017.
[255] Liang B et al., “ Detecting adversarial examples in deep networks with adaptive noise reduction ,” arXiv preprint arXiv:1705.08378 , 2017.
[256] Hendrycks D and Gimpel K, “ Early methods for detecting adversarial images ,” ICLR Workshop track, 2017. [ Google Scholar ]
[257] Metzen JH, Genewein T, Fischer V, and Bischoff B, “ On detecting adversarial perturbations ,” arXiv preprint arXiv:1702.04267 , 2017.
[258] Akhtar N, Liu J, and Mian A, “ Defense against universal adversarial perturbations ,” arXiv preprint arXiv:1711.05929 , 2017.
[259] Chen X et al., “ Infogan: Interpretable representation learning by information maximizing generative adversarial nets ,” in Advances in neural information processing systems, 2016, pp. 2172–2180. [ Google Scholar ]
[260] Hayes J and Ohrimenko O, “ Contamination attacks and mitigation in multi-party machine learning ,” in Advances in Neural Information Processing Systems, 2018, pp. 6604–6615. [ Google Scholar ]
[261] Jagielski M et al., “ Manipulating machine learning: Poisoning attacks and countermeasures for regression learning ,” in 2018 IEEE Symposium on Security and Privacy (SP). IEEE, 2018, pp. 19–35. [ Google Scholar ]
[262] Xiao H, Xiao H, and Eckert C, “ Adversarial label flips attack on support vector machines ,” in Proceedings of the 20th European Conference on Artificial Intelligence. IOS Press, 2012, pp. 870–875. [ Google Scholar ]
[263] Zhao M, An B, Gao W, and Zhang T, “ Efficient label contamination attacks against black-box learning models ,” in IJCAI , 2017.
[264] Biggio B, Nelson B, and Laskov P, “ Poisoning attacks against support vector machines ,” arXiv preprint arXiv:1206.6389 , 2012.
[265] Biggio B, Fumera G, and Roli F, “ Pattern recognition systems under attack: Design issues and research challenges ,” Intl. Journal of Pattern Recognition and Artificial Intelligence , vol. 28 , no. 07 , 2014. [ Google Scholar ]
[266] Raghunathan A, Steinhardt J, and Liang P, “ Certified defenses against adversarial examples ,” arXiv preprint arXiv:1801.09344 , 2018.
[267] Cao Y and Yang J, “ Towards making systems forget with machine unlearning ,” in S&P , 2015, pp. 463–80.
[268] He W, Wei J, Chen X, Carlini N, and Song D, “ Adversarial example defenses: Ensembles of weak defenses are not strong ,” arXiv preprint arXiv:1706.04701 , 2017.
[269] Bastani O et al., “ Measuring neural net robustness with constraints ,” in Advances in neural information processing systems, 2016, pp. 2613–2621. [ Google Scholar ]
[270] Katzir Z and Elovici Y, “ Quantifying the resilience of machine learning classifiers used for cyber security ,” Expert Systems with Applications , vol. 92 , pp. 419–429, 2018. [ Google Scholar ]
[271] Kumar N, Berg AC, Belhumeur PN, and Nayar SK, “ Attribute and simile classifiers for face verification ,” in Computer Vision, 2009 IEEE 12th International Conference on. IEEE, 2009, pp. 365–372. [ Google Scholar ]
[272] Rozsa A, Günther M, and Boult TE, “ Are accuracy and robustness correlated ,” in 2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, 2016, pp. 227–232. [ Google Scholar ]
[273] Fawzi A, Moosavi-Dezfooli S-M, and Frossard P, “ The robustness of deep networks: A geometrical perspective ,” IEEE Signal Processing Magazine , vol. 34 , no. 6 , pp. 50–62, 2017. [ Google Scholar ]
[274] Behzadan V and Munir A, “ Vulnerability of deep reinforcement learning to policy induction attacks ,” in International Conference on Machine Learning and Data Mining in Pattern Recognition. Springer, 2017, pp. 262–275. [ Google Scholar ]
[275] Yau K-LA, Qadir J, Khoo HL, Ling MH, and Komisarczuk P, “ A survey on reinforcement learning models and algorithms for traffic signal control ,” ACM Computing Surveys , vol. 50 , no. 3 , p. 34, 2017. [ Google Scholar ]
[276] Huang X, Kwiatkowska M, Wang S, and Wu M, “ Safety verification of deep neural networks ,” in International Conference on Computer Aided Verification. Springer, 2017, pp. 3–29. [ Google Scholar ]
[277] Pei K, Cao Y, Yang J, and Jana S, “ Deepxplore: Automated whitebox testing of deep learning systems ,” in proceedings of the 26th Symposium on Operating Systems Principles. ACM, 2017, pp. 1–18. [ Google Scholar ]
[278] Matthew P and Anderson M, “ Developing coercion detection solutions for biometric security ,” in SAI Computing Conf. IEEE, 2016, pp. 1123–30. [ Google Scholar ]
[279] Parthasaradhi ST, Derakhshani R, Hornak LA, and Schuckers SA, “ Time-series detection of perspiration as a liveness test in fingerprint devices ,” IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) , vol. 35 , no. 3 , pp. 335–343, 2005. [ Google Scholar ]