Generalization Bounds for Heavy-Tailed SDEs through the Fractional Fokker-Planck Equation

Dupuis, Benjamin; Şimşekli, Umut

arXiv:2402.07723 (stat)

[Submitted on 12 Feb 2024 (v1), last revised 3 Jun 2024 (this version, v2)]

Abstract: Understanding the generalization properties of heavy-tailed stochastic optimization algorithms has attracted increasing attention over the past years. While illuminating interesting aspects of stochastic optimizers by using heavy-tailed stochastic differential equations as proxies, prior works either provided expected generalization bounds, or introduced non-computable information theoretic terms. Addressing these drawbacks, in this work, we prove high-probability generalization bounds for heavy-tailed SDEs which do not contain any nontrivial information theoretic terms. To achieve this goal, we develop new proof techniques based on estimating the entropy flows associated with the so-called fractional Fokker-Planck equation (a partial differential equation that governs the evolution of the distribution of the corresponding heavy-tailed SDE). In addition to obtaining high-probability bounds, we show that our bounds have a better dependence on the dimension of parameters as compared to prior art. Our results further identify a phase transition phenomenon, which suggests that heavy tails can be either beneficial or harmful depending on the problem structure. We support our theory with experiments conducted in a variety of settings.
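For readers unfamiliar with the objects named in the abstract, the following is a minimal sketch of what a "heavy-tailed SDE" can look like in simulation. It is not taken from the paper: it assumes a one-dimensional SDE of the generic form dX_t = -F'(X_t) dt + sigma dL_t^alpha driven by a symmetric alpha-stable Lévy process, discretized with an Euler scheme, and it draws the stable noise with the standard Chambers-Mallows-Stuck method. The function names, the quadratic potential in the usage line, and all parameter values are hypothetical choices for illustration only.

```python
import math
import random


def sample_symmetric_alpha_stable(alpha, rng):
    """Draw one standard symmetric alpha-stable variate (0 < alpha <= 2).

    Uses the Chambers-Mallows-Stuck construction; alpha = 2 gives a
    Gaussian with variance 2, alpha = 1 gives a standard Cauchy.
    """
    u = rng.uniform(-math.pi / 2, math.pi / 2)  # uniform angle
    w = rng.expovariate(1.0)                    # unit-rate exponential
    if alpha == 1.0:
        return math.tan(u)
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


def simulate_heavy_tailed_sde(grad, x0, alpha, sigma, step, n_steps, seed=0):
    """Euler scheme for dX = -grad(X) dt + sigma dL^alpha (illustrative only).

    The noise increment over a step of length `step` scales as
    step ** (1 / alpha), by self-similarity of the stable process.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for _ in range(n_steps):
        noise = sample_symmetric_alpha_stable(alpha, rng)
        x = x - step * grad(x) + sigma * step ** (1.0 / alpha) * noise
        path.append(x)
    return path


# Hypothetical usage: quadratic potential F(x) = x^2 / 2, so grad(x) = x.
path = simulate_heavy_tailed_sde(lambda x: x, x0=1.0, alpha=1.8,
                                 sigma=0.1, step=0.01, n_steps=1000)
```

Smaller values of alpha produce heavier tails and occasional large jumps in the path, while alpha = 2 recovers Gaussian noise. The fractional Fokker-Planck equation mentioned in the abstract describes how the probability density of such a process evolves over time.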

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2402.07723 [stat.ML]
	(or arXiv:2402.07723v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.2402.07723

Submission history

From: Benjamin Dupuis
[v1] Mon, 12 Feb 2024 15:35:32 UTC (1,770 KB)
[v2] Mon, 3 Jun 2024 14:20:34 UTC (1,878 KB)
