Learning the Target Network in Function Space

Asadi, Kavosh; Liu, Yao; Sabach, Shoham; Yin, Ming; Fakoor, Rasool

arXiv:2406.01838 (cs)

[Submitted on 3 Jun 2024 (v1), last revised 23 Sep 2024 (this version, v2)]

Abstract: We focus on the task of learning the value function in the reinforcement learning (RL) setting. This task is often solved by updating a pair of online and target networks while ensuring that the parameters of these two networks are equivalent. We propose Lookahead-Replicate (LR), a new value-function approximation algorithm that is agnostic to this parameter-space equivalence. Instead, the LR algorithm is designed to maintain an equivalence between the two networks in the function space. This value-based equivalence is obtained by employing a new target-network update. We show that LR leads to convergent behavior in learning the value function. We also present empirical results demonstrating that LR-based target-network updates significantly improve deep RL on the Atari benchmark.
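The distinction between parameter-space and function-space equivalence can be illustrated with a toy sketch. The abstract does not spell out the LR update rule, so everything below is an assumption made for illustration: a linear value function over deliberately redundant features, a hypothetical `copy_update` standing in for the usual parameter copy, and a hypothetical `function_space_update` that runs plain gradient descent on the squared difference between target and online values. It is not the authors' implementation.

```python
import numpy as np

# Hypothetical toy setup: 20 states with redundant features [s, s^2, s + s^2],
# so different parameter vectors can represent the same value function.
states = np.linspace(-1.0, 1.0, 20)
features = np.stack([states, states**2, states + states**2], axis=1)


def values(params, features):
    """Linear value estimates v(s) = phi(s) . params for every state."""
    return features @ params


def copy_update(online_params):
    """Parameter-space update: the target is an exact copy of the online weights."""
    return online_params.copy()


def function_space_update(target_params, online_params, features, lr=0.5, steps=2000):
    """Assumed function-space update (not taken from the paper): gradient descent
    on 0.5 * mean squared difference between target and online values.
    The online parameters are never copied into the target."""
    w = target_params.copy()
    goal = values(online_params, features)
    for _ in range(steps):
        error = values(w, features) - goal
        w -= lr * features.T @ error / len(error)
    return w


online = np.array([1.0, 2.0, 0.0])
target = function_space_update(np.zeros(3), online, features)

# Largest disagreement between the two networks, in values and in weights.
value_gap = np.max(np.abs(values(target, features) - values(online, features)))
param_gap = np.max(np.abs(target - online))
print(f"value gap: {value_gap:.2e}, parameter gap: {param_gap:.2f}")
```

Because the third feature is the sum of the first two, many weight vectors produce the same values. In this sketch the target should end up matching the online network's values closely while keeping clearly different weights, which is the kind of equivalence the abstract describes as holding in function space rather than parameter space.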

Comments:	Accepted to the International Conference on Machine Learning (ICML 2024)
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2406.01838 [cs.LG]
	(or arXiv:2406.01838v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2406.01838

Submission history

From: Ming Yin
[v1] Mon, 3 Jun 2024 23:10:35 UTC (2,358 KB)
[v2] Mon, 23 Sep 2024 02:43:55 UTC (2,360 KB)
