Artificial Neural Variability for Deep Learning: On Overfitting, Noise Memorization, and Catastrophic Forgetting

Xie, Zeke; He, Fengxiang; Fu, Shaopeng; Sato, Issei; Tao, Dacheng; Sugiyama, Masashi

arXiv:2011.06220 (cs)

[Submitted on 12 Nov 2020 (v1), last revised 10 May 2021 (this version, v3)]

Abstract: Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting. A deep network can even memorize randomly labelled data, in which the instance-label pairs carry little knowledge. When a deep network learns continually over time by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. It is well known in neuroscience that human brain reactions exhibit substantial variability even in response to the same stimulus, a phenomenon referred to as neural variability. This mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems. This motivates us to design a similar mechanism, named artificial neural variability (ANV), which helps artificial neural networks learn some advantages from "natural" neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees that ANV strictly improves generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a neural variable risk minimization (NVRM) framework and neural variable optimizers to achieve ANV for conventional network architectures in practice. The empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible cost. Code: this https URL.
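
The abstract does not spell out how variability is injected into training, so the following is only a minimal sketch of one plausible reading: it assumes the variability is zero-mean Gaussian noise added to the weights, and that the loss gradient is evaluated at the perturbed weights. The names `perturbed_grad`, `train`, and `noise_std`, as well as the toy least-squares problem, are hypothetical and are not taken from the paper or its released code.

```python
import numpy as np


def perturbed_grad(theta, grad_fn, noise_std, rng):
    """One-sample estimate of the gradient of the noise-averaged loss.

    Assumption: the weight perturbation is Gaussian with standard
    deviation noise_std; the abstract does not state the distribution.
    """
    eps = rng.normal(0.0, noise_std, size=theta.shape)
    return grad_fn(theta + eps)


def train(theta0, grad_fn, lr=0.1, noise_std=0.01, steps=200, seed=0):
    """Plain gradient descent driven by gradients taken at perturbed weights."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for _ in range(steps):
        theta -= lr * perturbed_grad(theta, grad_fn, noise_std, rng)
    return theta


# Hypothetical toy problem: noiseless linear regression.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(64, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true


def mse(theta):
    """Mean squared error of the linear model on the toy data."""
    return float(np.mean((X @ theta - y) ** 2))


def mse_grad(theta):
    """Gradient of the mean squared error with respect to theta."""
    return 2.0 * X.T @ (X @ theta - y) / len(y)


theta_hat = train(np.zeros(3), mse_grad)
print("initial loss:", mse(np.zeros(3)))
print("final loss:  ", mse(theta_hat))
```

In this sketch `noise_std` controls the trade-off: a value of zero recovers ordinary gradient descent, while larger values push the iterate toward regions where the loss stays low under weight perturbation. How the paper's neural variable optimizers schedule or apply the noise may differ.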

Comments:	Accepted by Neural Computation, MIT Press; 20 pages; 13 figures; Key Words: Neural Variability, Neuroscience, Deep Learning, Label Noise, Catastrophic Forgetting
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2011.06220 [cs.LG]
	(or arXiv:2011.06220v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2011.06220

Submission history

From: Zeke Xie
[v1] Thu, 12 Nov 2020 06:06:33 UTC (1,099 KB)
[v2] Tue, 24 Nov 2020 05:01:19 UTC (1,226 KB)
[v3] Mon, 10 May 2021 12:44:20 UTC (1,254 KB)
