Mitigating Bias in Calibration Error Estimation

Roelofs, Rebecca; Cain, Nicholas; Shlens, Jonathon; Mozer, Michael C.

arXiv:2012.08668 (cs)

[Submitted on 15 Dec 2020 (v1), last revised 11 Feb 2022 (this version, v3)]

Abstract: For an AI system to be reliable, the confidence it expresses in its decisions must match its accuracy. To assess the degree of match, examples are typically binned by confidence and the per-bin mean confidence and accuracy are compared. Most research in calibration focuses on techniques to reduce this empirical measure of calibration error, ECE_bin. We instead focus on assessing statistical bias in this empirical measure, and we identify better estimators. We propose a framework through which we can compute the bias of a particular estimator for an evaluation data set of a given size. The framework involves synthesizing model outputs that have the same statistics as common neural architectures on popular data sets. We find that binning-based estimators with bins of equal mass (number of instances) have lower bias than estimators with bins of equal width. Our results indicate two reliable calibration-error estimators: the debiased estimator (Bröcker, 2012; Ferro and Fricker, 2012) and a method we propose, ECE_sweep, which uses equal-mass bins and chooses the number of bins to be as large as possible while preserving monotonicity in the calibration function. With these estimators, we observe improvements in the effectiveness of recalibration methods and in the detection of model miscalibration.
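To make the binning procedure in the abstract concrete, here is a minimal Python sketch. It is not the authors' released code. The function names `ece_bin` and `ece_sweep`, the use of an absolute (L1) gap per bin, the default of 15 bins, and the rule of stopping at the first monotonicity violation are all assumptions made for illustration; the paper itself should be consulted for the exact definitions.

```python
import numpy as np


def ece_bin(conf, correct, n_bins=15, equal_mass=True):
    """Binned calibration error: mass-weighted mean of |accuracy - confidence|.

    Assumption: an L1 gap per bin; the norm used in the paper may differ.
    """
    conf = np.asarray(conf, dtype=float)
    correct = np.asarray(correct, dtype=float)
    order = np.argsort(conf, kind="stable")
    conf, correct = conf[order], correct[order]
    n = conf.size
    if equal_mass:
        # Split the confidence-sorted examples into groups of (nearly) equal size.
        groups = np.array_split(np.arange(n), n_bins)
    else:
        # Equal-width bins on [0, 1]; a confidence of exactly 1.0 goes in the last bin.
        idx = np.minimum((conf * n_bins).astype(int), n_bins - 1)
        groups = [np.flatnonzero(idx == b) for b in range(n_bins)]
    err = 0.0
    for g in groups:
        if g.size:  # empty bins contribute nothing
            err += g.size / n * abs(correct[g].mean() - conf[g].mean())
    return err


def ece_sweep(conf, correct):
    """Use the largest equal-mass bin count whose per-bin accuracies are monotone.

    Assumption: the sweep stops at the first bin count that breaks monotonicity.
    Returns (calibration error estimate, chosen number of bins).
    """
    order = np.argsort(np.asarray(conf, dtype=float), kind="stable")
    sorted_correct = np.asarray(correct, dtype=float)[order]
    best = 1
    for b in range(2, sorted_correct.size + 1):
        accs = [g.mean() for g in np.array_split(sorted_correct, b)]
        if np.any(np.diff(accs) < 0):  # accuracy dropped as confidence rose
            break
        best = b
    return ece_bin(conf, correct, n_bins=best, equal_mass=True), best
```

Calling `ece_bin(conf, correct, equal_mass=False)` gives the equal-width variant, so the two binning schemes compared in the abstract can be run on the same predictions. The sweep re-bins the data once per candidate bin count, which is simple but slow for large evaluation sets.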

Comments:	To be published in AISTATS 2022. Code is available at this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
Cite as:	arXiv:2012.08668 [cs.LG]
	(or arXiv:2012.08668v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2012.08668

Submission history

From: Rebecca Roelofs
[v1] Tue, 15 Dec 2020 23:28:06 UTC (798 KB)
[v2] Wed, 24 Feb 2021 19:25:00 UTC (1,491 KB)
[v3] Fri, 11 Feb 2022 00:15:27 UTC (3,818 KB)
