A Geometry-Inspired Attack for Generating Natural Language Adversarial Examples

Meng, Zhao; Wattenhofer, Roger

arXiv:2010.01345 (cs)

[Submitted on 3 Oct 2020]

Abstract: Generating adversarial examples for natural language is hard, as natural language consists of discrete symbols, and examples are often of variable lengths. In this paper, we propose a geometry-inspired attack for generating natural language adversarial examples. Our attack generates adversarial examples by iteratively approximating the decision boundary of Deep Neural Networks (DNNs). Experiments on two datasets with two different models show that our attack fools natural language models with high success rates, while only replacing a few words. Human evaluation shows that adversarial examples generated by our attack are hard for humans to recognize. Further experiments show that adversarial training can improve model robustness against our attack.
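
The abstract gives only the general idea: approximate the decision boundary, then replace a few words to cross it. The following is a minimal sketch of that idea under strong simplifying assumptions, not the authors' algorithm. It uses a toy linear classifier over averaged word embeddings, for which the direction to the boundary is known exactly; a real DNN would need a local approximation at each iteration. The embeddings, the candidate list, the weights, and all function names are invented for illustration.

```python
import math

# Hypothetical 2-D word embeddings, invented for this sketch.
EMB = {
    "the": (0.0, 0.0),
    "movie": (0.1, 0.0),
    "was": (0.0, 0.1),
    "great": (1.0, 0.2),
    "good": (0.7, 0.1),
    "fine": (0.4, 0.1),
    "okay": (0.3, 0.0),
}
# Hypothetical replacement candidates per word (assumed given).
CANDIDATES = {"great": ["good", "fine", "okay"]}
# Toy linear classifier: score = W . v + B, label 1 if score > 0.
W, B = (1.0, 0.0), -0.15


def text_vector(words):
    """Represent a text as the mean of its word embeddings."""
    n = len(words)
    return tuple(sum(EMB[w][i] for w in words) / n for i in range(2))


def score(words):
    v = text_vector(words)
    return sum(wi * vi for wi, vi in zip(W, v)) + B


def predict(words):
    return 1 if score(words) > 0 else 0


def boundary_direction(words):
    """Unit vector pointing from the text vector toward the boundary.

    Exact for a linear model; for a DNN this would be a local estimate.
    """
    norm = math.sqrt(sum(wi * wi for wi in W))
    sign = 1.0 if score(words) > 0 else -1.0
    return tuple(-sign * wi / norm for wi in W)


def attack(words, max_replacements=3):
    """Greedily replace one word per iteration until the label flips."""
    words = list(words)
    original = predict(words)
    replaced = 0
    for _ in range(max_replacements):
        if predict(words) != original:
            break
        direction = boundary_direction(words)
        base = text_vector(words)
        best = None  # (projection, position, candidate)
        for i, w in enumerate(words):
            for cand in CANDIDATES.get(w, []):
                trial = words[:i] + [cand] + words[i + 1:]
                delta = [t - b for t, b in zip(text_vector(trial), base)]
                proj = sum(d * u for d, u in zip(delta, direction))
                if best is None or proj > best[0]:
                    best = (proj, i, cand)
        if best is None or best[0] <= 0:
            break  # no substitution moves toward the boundary
        words[best[1]] = best[2]
        replaced += 1
    return words, replaced, predict(words) != original


adversarial, n_replaced, flipped = attack(["the", "movie", "was", "great"])
print(adversarial, n_replaced, flipped)
```

Scoring each substitution by its projection onto the boundary direction is one plausible way to keep the number of replaced words small, since each step is chosen to make the most progress toward a label change.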

Comments:	COLING 2020 Long Paper
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2010.01345 [cs.CL]
	(or arXiv:2010.01345v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.01345

Submission history

From: Zhao Meng
[v1] Sat, 3 Oct 2020 12:58:47 UTC (53 KB)
