DROP: A Reading Comprehension Benchmark Requiring Discrete Reasoning Over Paragraphs

Dua, Dheeru; Wang, Yizhong; Dasigi, Pradeep; Stanovsky, Gabriel; Singh, Sameer; Gardner, Matt

arXiv:1903.00161 (cs)

[Submitted on 1 Mar 2019 (v1), last revised 16 Apr 2019 (this version, v2)]

Abstract: Reading comprehension has recently seen rapid progress, with systems matching humans on the most popular datasets for the task. However, a large body of work has highlighted the brittleness of these systems, showing that there is much work left to be done. We introduce a new English reading comprehension benchmark, DROP, which requires Discrete Reasoning Over the content of Paragraphs. In this crowdsourced, adversarially-created, 96k-question benchmark, a system must resolve references in a question, perhaps to multiple input positions, and perform discrete operations over them (such as addition, counting, or sorting). These operations require a much more comprehensive understanding of the content of paragraphs than what was necessary for prior datasets. We apply state-of-the-art methods from both the reading comprehension and semantic parsing literature on this dataset and show that the best systems only achieve 32.7% F1 on our generalized accuracy metric, while expert human performance is 96.0%. We additionally present a new model that combines reading comprehension methods with simple numerical reasoning to achieve 47.0% F1.
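To make the three discrete operations named in the abstract (addition, counting, sorting) concrete, here is a minimal sketch in Python. The passage is invented for illustration and is not taken from the dataset, and the helper names (`extract_numbers`, `total_yards`, `count_mentions`, `longest_play`) are hypothetical. This is not the authors' model, only a hand-written illustration of the kind of reasoning a question might require once the relevant spans have been located.

```python
import re

# Hypothetical passage written for this example; not from the DROP dataset.
PASSAGE = (
    "The home team scored a 7-yard touchdown in the first quarter, "
    "added a 23-yard field goal in the second, and closed with a "
    "41-yard field goal in the fourth."
)

def extract_numbers(text):
    """Return every integer or decimal in the text, in reading order."""
    return [float(m) for m in re.findall(r"\d+(?:\.\d+)?", text)]

def total_yards(text):
    """Addition: sum all numbers mentioned in the passage."""
    return sum(extract_numbers(text))

def count_mentions(text, phrase):
    """Counting: number of times a phrase occurs in the passage."""
    return text.count(phrase)

def longest_play(text):
    """Sorting: order the numbers descending and take the largest."""
    return sorted(extract_numbers(text), reverse=True)[0]

if __name__ == "__main__":
    print(total_yards(PASSAGE))                   # 71.0
    print(count_mentions(PASSAGE, "field goal"))  # 2
    print(longest_play(PASSAGE))                  # 41.0
```

The hard part of the benchmark, as the abstract describes it, is resolving which spans a question refers to; the sketch assumes that step away by treating every number in the passage as relevant.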

Subjects: Computation and Language (cs.CL)
Cite as: arXiv:1903.00161 [cs.CL] (or arXiv:1903.00161v2 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.1903.00161

Submission history

From: Dheeru Dua
[v1] Fri, 1 Mar 2019 05:32:01 UTC (2,543 KB)
[v2] Tue, 16 Apr 2019 21:22:39 UTC (3,145 KB)
