Average gradient outer product as a mechanism for deep neural collapse

Beaglehole, Daniel; Súkeník, Peter; Mondelli, Marco; Belkin, Mikhail

arXiv:2402.13728v1 (cs)

[Submitted on 21 Feb 2024 (this version), latest version 17 Oct 2024 (v5)]

Abstract: Deep Neural Collapse (DNC) refers to the surprisingly rigid structure of the data representations in the final layers of Deep Neural Networks (DNNs). Though the phenomenon has been measured in a wide variety of settings, its emergence is only partially understood. In this work, we provide substantial evidence that DNC formation occurs primarily through deep feature learning with the average gradient outer product (AGOP). This goes a step further than efforts that explain neural collapse via feature-agnostic approaches, such as the unconstrained features model. We proceed by providing evidence that the right singular vectors and values of the weights are responsible for the majority of within-class variability collapse in DNNs. As shown in recent work, this singular structure is highly correlated with that of the AGOP. We then establish experimentally and theoretically that AGOP induces neural collapse in a randomly initialized neural network. In particular, we demonstrate that Deep Recursive Feature Machines, a method originally introduced as an abstraction for AGOP feature learning in convolutional neural networks, exhibit DNC.
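
The abstract names the AGOP without defining it. As a rough illustration only, the sketch below assumes the usual definition from the feature-learning literature: the average over inputs x of J(x)^T J(x), where J(x) is the Jacobian of the predictor with respect to its input. The helper names `numerical_jacobian` and `agop` are hypothetical, and the central-difference Jacobian is a stand-in chosen for self-containment; none of this is taken from the paper's code.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference Jacobian of f at x, shape (out_dim, in_dim).

    Hypothetical helper: a real implementation would more likely use
    automatic differentiation.
    """
    x = np.asarray(x, dtype=float)
    cols = []
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        diff = np.asarray(f(x + step)) - np.asarray(f(x - step))
        cols.append(diff / (2 * eps))
    # Each entry of cols is df/dx_j; stack them as columns.
    return np.stack(cols, axis=-1).reshape(-1, x.size)

def agop(f, X, eps=1e-6):
    """Average of J(x)^T J(x) over the rows x of X; shape (in_dim, in_dim)."""
    d = X.shape[1]
    M = np.zeros((d, d))
    for x in X:
        J = numerical_jacobian(f, x, eps)
        M += J.T @ J
    return M / X.shape[0]
```

For a linear map x -> W x the Jacobian is W at every input, so under this definition the AGOP reduces to W^T W, whose eigenvectors are the right singular vectors of W. That special case gives some intuition for why the abstract relates the right singular structure of the weights to the AGOP.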

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2402.13728 [cs.LG]
	(or arXiv:2402.13728v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.13728

Submission history

From: Peter Súkeník [view email]
[v1] Wed, 21 Feb 2024 11:40:27 UTC (2,117 KB)
[v2] Thu, 23 May 2024 19:36:39 UTC (8,025 KB)
[v3] Fri, 4 Oct 2024 04:31:19 UTC (8,703 KB)
[v4] Mon, 7 Oct 2024 16:55:21 UTC (8,703 KB)
[v5] Thu, 17 Oct 2024 19:25:39 UTC (8,703 KB)
