Detecting Adversarial Samples for Deep Neural Networks through Mutation Testing

Wang, Jingyi; Sun, Jun; Zhang, Peixin; Wang, Xinyu

arXiv:1805.05010 (cs)

[Submitted on 14 May 2018 (v1), last revised 17 May 2018 (this version, v2)]

Abstract: Recently, it has been shown that deep neural networks (DNNs) are subject to attacks through adversarial samples. Adversarial samples are often crafted through adversarial perturbation, i.e., manipulating the original sample with minor modifications so that the DNN model labels the sample incorrectly. Because it is almost impossible to train a perfect DNN, adversarial samples have been shown to be easy to generate. As DNNs are increasingly used in safety-critical systems like autonomous cars, it is crucial to develop techniques for defending against such attacks. Existing defense mechanisms, which aim to make adversarial perturbation challenging, have been shown to be ineffective. In this work, we propose an alternative approach. We first observe that adversarial samples are much more sensitive to perturbations than normal samples. That is, if we impose random perturbations on a normal sample and on an adversarial sample, the ratios of label changes caused by the perturbations differ significantly. Based on this observation, we design a statistical adversary detection algorithm called nMutant (inspired by mutation testing from the software engineering community). Our experiments show that nMutant effectively detects most of the adversarial samples generated by recently proposed attacking methods. Furthermore, we provide an error bound with a certain statistical significance along with the detection.
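The observation in the abstract, that random perturbations flip the label of an adversarial sample far more often than that of a normal sample, can be illustrated with a small sketch. This is not the authors' nMutant implementation: `label_change_rate`, `toy_predict`, the uniform noise model, and the values of `n_mutants`, `step`, and `THRESHOLD` are all assumptions made for illustration, and a toy linear classifier stands in for a DNN. The fixed threshold is a simplification, and the statistical error bound mentioned in the abstract is not reproduced here.

```python
import numpy as np

def label_change_rate(predict, x, n_mutants=200, step=0.05, rng=None):
    """Fraction of randomly perturbed copies of x whose label differs from x's.

    predict: callable mapping an array to a class label (hypothetical model API).
    step:    half-width of the uniform noise added to each feature (assumed).
    """
    rng = np.random.default_rng() if rng is None else rng
    base_label = predict(x)
    # One independent random perturbation per mutant, same shape as x.
    noise = rng.uniform(-step, step, size=(n_mutants,) + x.shape)
    changed = sum(predict(x + d) != base_label for d in noise)
    return changed / n_mutants

def toy_predict(x):
    """Toy stand-in for a DNN: label 1 if the features sum to more than 0."""
    return int(x.sum() > 0.0)

rng = np.random.default_rng(0)

# A sample far from the decision boundary (behaves like a normal sample).
normal_like = np.full(10, 0.5)
# A sample barely across the boundary (behaves like an adversarial sample
# produced by a minimal perturbation).
boundary_like = np.full(10, 0.001)

rate_normal = label_change_rate(toy_predict, normal_like, rng=rng)
rate_boundary = label_change_rate(toy_predict, boundary_like, rng=rng)

# Hypothetical decision rule: flag a sample whose label flips too often.
THRESHOLD = 0.1
flag_normal = rate_normal > THRESHOLD
flag_boundary = rate_boundary > THRESHOLD

print(f"normal-like:   rate={rate_normal:.3f} flagged={flag_normal}")
print(f"boundary-like: rate={rate_boundary:.3f} flagged={flag_boundary}")
```

In this toy setting the sample far from the boundary never changes label, because the summed noise cannot exceed 0.5 in magnitude, while the sample near the boundary flips in a large share of the mutants. A real detector would need a noise model and a threshold calibrated on the model and data at hand.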

Comments:	Submitted to NIPS 2018
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1805.05010 [cs.LG]
	(or arXiv:1805.05010v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1805.05010

Submission history

From: Jingyi Wang Ph.D.
[v1] Mon, 14 May 2018 04:48:24 UTC (453 KB)
[v2] Thu, 17 May 2018 08:38:04 UTC (467 KB)
