You Only Need a Good Embeddings Extractor to Fix Spurious Correlations

Mehta, Raghav; Albiero, Vítor; Chen, Li; Evtimov, Ivan; Glaser, Tamar; Li, Zhiheng; Hassner, Tal


arXiv:2212.06254 (cs)

[Submitted on 12 Dec 2022]



Abstract: Spurious correlations in training data often lead to robustness issues because models learn to use them as shortcuts. For example, when predicting whether an object is a cow, a model might learn to rely on the green background, and so perform poorly on a cow against a sandy background. Waterbirds is a standard dataset for benchmarking methods that mitigate this problem. The best method, Group Distributionally Robust Optimization (GroupDRO), currently achieves 89% worst-group accuracy, whereas standard training from scratch on raw images reaches only 72%. GroupDRO requires training a model end-to-end with subgroup labels. In this paper, we show that we can achieve up to 90% accuracy without using any subgroup information in the training set, simply by extracting embeddings from a large pre-trained vision model and training a linear classifier on top of them. With experiments on a wide range of pre-trained models and pre-training datasets, we show that both the capacity of the pre-trained model and the size of the pre-training dataset matter. Our experiments reveal that high-capacity vision transformers perform better than high-capacity convolutional neural networks, and that a larger pre-training dataset leads to better worst-group accuracy on the spurious correlation dataset.
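The recipe in the abstract (fixed embeddings, a linear classifier on top, evaluation by worst-group accuracy) can be illustrated with a minimal sketch. This is not the authors' code. The two-dimensional synthetic vectors are a hypothetical stand-in for the output of a pre-trained extractor, plain gradient-descent logistic regression is an assumed choice of linear classifier, and the names `worst_group_accuracy`, `train_linear_probe`, `predict`, and `make_split` are made up for this example. Group labels are used only at evaluation time, never during training.

```python
import math
import random
from collections import defaultdict


def worst_group_accuracy(labels, groups, preds):
    """Return (lowest per-group accuracy, dict of accuracy per group)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for y, g, p in zip(labels, groups, preds):
        total[g] += 1
        correct[g] += int(y == p)
    per_group = {g: correct[g] / total[g] for g in total}
    return min(per_group.values()), per_group


def train_linear_probe(embeddings, labels, epochs=200, lr=0.5):
    """Fit binary logistic regression on fixed embeddings (full-batch gradient descent)."""
    dim = len(embeddings[0])
    n = len(embeddings)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(embeddings, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - y  # sigmoid(z) - target
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * gi / n for wi, gi in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, embeddings):
    """Predict class 1 when the linear score is positive, else class 0."""
    return [int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0) for x in embeddings]


def make_split(rng, n, p_agree):
    """Synthetic stand-in for extractor output (an assumption, not real data).

    Dimension 0 carries the label; dimension 1 carries a 'background' that
    matches the label with probability p_agree.
    """
    xs, ys, gs = [], [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        bg = y if rng.random() < p_agree else 1 - y
        xs.append([(2 * y - 1) + rng.gauss(0, 0.5),
                   0.5 * (2 * bg - 1) + rng.gauss(0, 0.5)])
        ys.append(y)
        gs.append((y, bg))  # group = (label, background)
    return xs, ys, gs


if __name__ == "__main__":
    rng = random.Random(0)
    train_x, train_y, _ = make_split(rng, 400, p_agree=0.95)  # groups unused in training
    test_x, test_y, test_g = make_split(rng, 400, p_agree=0.5)
    w, b = train_linear_probe(train_x, train_y)
    worst, per_group = worst_group_accuracy(test_y, test_g, predict(w, b, test_x))
    print("worst-group accuracy:", worst)
    print("per-group accuracy:", per_group)
```

In a real pipeline the rows of `train_x` and `test_x` would come from a frozen pre-trained vision model rather than `make_split`; only the linear probe is trained.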

Comments:	Accepted at ECCV 2022 workshop on Responsible Computer Vision (RCV)
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2212.06254 [cs.CV]
	(or arXiv:2212.06254v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2212.06254

Submission history

From: Raghav Mehta
[v1] Mon, 12 Dec 2022 21:42:33 UTC (406 KB)
