Masked Language Model Scoring

Salazar, Julian; Liang, Davis; Nguyen, Toan Q.; Kirchhoff, Katrin

doi:10.18653/v1/2020.acl-main.240

arXiv:1910.14659 (cs)

[Submitted on 31 Oct 2019 (v1), last revised 1 Jan 2021 (this version, v3)]

Abstract: Pretrained masked language models (MLMs) require finetuning for most NLP tasks. Instead, we evaluate MLMs out of the box via their pseudo-log-likelihood scores (PLLs), which are computed by masking tokens one by one. We show that PLLs outperform scores from autoregressive language models like GPT-2 in a variety of tasks. By rescoring ASR and NMT hypotheses, RoBERTa reduces an end-to-end LibriSpeech model's WER by 30% relative and adds up to +1.7 BLEU on state-of-the-art baselines for low-resource translation pairs, with further gains from domain adaptation. We attribute this success to PLL's unsupervised expression of linguistic acceptability without a left-to-right bias, greatly improving on scores from GPT-2 (+10 points on island effects, NPI licensing in BLiMP). One can finetune MLMs to give scores without masking, enabling computation in a single inference pass. In all, PLLs and their associated pseudo-perplexities (PPPLs) enable plug-and-play use of the growing number of pretrained MLMs; e.g., we use a single cross-lingual model to rescore translations in multiple languages. We release our library for language model scoring at this https URL.
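
The masking procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' released library: the names `pseudo_log_likelihood`, `pseudo_perplexity`, `MASK`, and `toy_uniform_model` are hypothetical, and the toy model merely stands in for a real pretrained MLM. The sketch assumes the PLL of a sentence is the sum, over positions, of the log-probability of each token given the sentence with that token masked, and that the PPPL of a corpus is the exponential of the negative PLL averaged over all tokens.

```python
import math
from typing import Callable, Sequence

MASK = "[MASK]"  # hypothetical mask symbol; real tokenizers define their own

# Assumed interface: (masked_tokens, position, original_token) -> log-probability
# of the original token at that position given the masked context.
CondLogProb = Callable[[Sequence[str], int, str], float]


def pseudo_log_likelihood(tokens: Sequence[str], cond_log_prob: CondLogProb) -> float:
    """Sum the conditional log-probabilities, masking one token at a time."""
    total = 0.0
    for position, target in enumerate(tokens):
        masked = list(tokens)
        masked[position] = MASK  # hide only the token being scored
        total += cond_log_prob(masked, position, target)
    return total


def pseudo_perplexity(corpus: Sequence[Sequence[str]], cond_log_prob: CondLogProb) -> float:
    """Exponentiate the negative PLL averaged over every token in the corpus."""
    n_tokens = sum(len(sentence) for sentence in corpus)
    total_pll = sum(pseudo_log_likelihood(s, cond_log_prob) for s in corpus)
    return math.exp(-total_pll / n_tokens)


def toy_uniform_model(masked: Sequence[str], position: int, target: str) -> float:
    """Toy stand-in for an MLM: uniform over an assumed 4-word vocabulary."""
    return math.log(1.0 / 4.0)


sentence = ["the", "cat", "sat"]
pll = pseudo_log_likelihood(sentence, toy_uniform_model)
pppl = pseudo_perplexity([sentence, ["hello", "world"]], toy_uniform_model)
```

With a real model, `cond_log_prob` would run one forward pass per masked position, which is why a sentence of n tokens costs n inference passes unless the model is finetuned to score without masking.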

Comments:	ACL 2020 camera-ready (presented July 2020)
Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS); Machine Learning (stat.ML)
Cite as:	arXiv:1910.14659 [cs.CL]
	(or arXiv:1910.14659v3 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1910.14659
Journal reference:	Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020), 2699-2712
Related DOI:	https://doi.org/10.18653/v1/2020.acl-main.240

Submission history

From: Julian Salazar
[v1] Thu, 31 Oct 2019 17:51:21 UTC (129 KB)
[v2] Thu, 14 May 2020 17:55:10 UTC (128 KB)
[v3] Fri, 1 Jan 2021 00:00:14 UTC (129 KB)
