With Little Power Comes Great Responsibility

Card, Dallas; Henderson, Peter; Khandelwal, Urvashi; Jia, Robin; Mahowald, Kyle; Jurafsky, Dan

arXiv:2010.06595 (cs)

[Submitted on 13 Oct 2020]

Abstract: Despite its importance to experimental design, statistical power (the probability that, given a real effect, an experiment will reject the null hypothesis) has largely been ignored by the NLP community. Underpowered experiments make it more difficult to discern the difference between statistical noise and meaningful model improvements, and increase the chances of exaggerated findings. By meta-analyzing a set of existing NLP papers and datasets, we characterize typical power for a variety of settings and conclude that underpowered experiments are common in the NLP literature. In particular, for several tasks in the popular GLUE benchmark, small test sets mean that most attempted comparisons to state-of-the-art models will not be adequately powered. Similarly, based on reasonable assumptions, we find that the most typical experimental design for human rating studies will be underpowered to detect small model differences, of the sort that are frequently studied. For machine translation, we find that typical test sets of 2000 sentences have approximately 75% power to detect differences of 1 BLEU point. To improve the situation going forward, we give an overview of best practices for power analysis in NLP and release a series of notebooks to assist with future power analyses.
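To make the definition of power in the abstract concrete, the following is a minimal sketch of a simulation-based power estimate for comparing two classifiers on a shared test set. It is not taken from the authors' released notebooks. It assumes an exact McNemar (sign) test on the items where the two models disagree, and the function names (`mcnemar_exact_p`, `simulate_power`) and the disagreement rates in the usage example are hypothetical values chosen only for illustration.

```python
import random
from math import comb


def mcnemar_exact_p(only_a: int, only_b: int) -> float:
    """Two-sided exact McNemar (sign test) p-value from discordant counts.

    only_a: items that model A gets right and model B gets wrong.
    only_b: items that model B gets right and model A gets wrong.
    """
    n = only_a + only_b
    if n == 0:
        return 1.0  # the models never disagree, so there is no evidence
    # Under the null, each discordant item favors A or B with probability 0.5.
    tail = sum(comb(n, k) for k in range(min(only_a, only_b) + 1))
    return min(1.0, 2 * tail / 2 ** n)


def simulate_power(n_items, p_only_a, p_only_b, alpha=0.05, n_sims=1000, seed=0):
    """Estimate power as the fraction of simulated test sets that reject the null.

    p_only_a and p_only_b are the assumed per-item probabilities that only
    model A (or only model B) is correct; they define the assumed true effect.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        only_a = only_b = 0
        for _ in range(n_items):
            u = rng.random()
            if u < p_only_a:
                only_a += 1
            elif u < p_only_a + p_only_b:
                only_b += 1
        if mcnemar_exact_p(only_a, only_b) < alpha:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    # Hypothetical scenario: a 2-point accuracy gap (6% vs. 4% discordant rates).
    for n in (500, 2000):
        print(n, simulate_power(n, p_only_a=0.06, p_only_b=0.04, n_sims=200))
```

Under these assumptions, rerunning the loop with larger `n_items` shows how the estimated power grows with test set size for a fixed true difference, which is the relationship the abstract's GLUE and machine translation findings turn on.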

Comments:	To appear at EMNLP 2020
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2010.06595 [cs.CL]
	(or arXiv:2010.06595v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.06595

Submission history

From: Dallas Card
[v1] Tue, 13 Oct 2020 18:00:02 UTC (482 KB)
