Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs

Zhang, Qingyang; Yang, Yiming; Ruan, Jingqing; Xiong, Xuantang; Xing, Dengpeng; Xu, Bo


arXiv:2307.12063 (cs)

[Submitted on 22 Jul 2023]



Abstract: Goal-Conditioned Hierarchical Reinforcement Learning (GCHRL) is a promising paradigm to address the exploration-exploitation dilemma in reinforcement learning. It decomposes the source task into subgoal conditional subtasks and conducts exploration and exploitation in the subgoal space. The effectiveness of GCHRL heavily relies on subgoal representation functions and subgoal selection strategy. However, existing works often overlook the temporal coherence in GCHRL when learning latent subgoal representations and lack an efficient subgoal selection strategy that balances exploration and exploitation. This paper proposes HIerarchical reinforcement learning via dynamically building Latent Landmark graphs (HILL) to overcome these limitations. HILL learns latent subgoal representations that satisfy temporal coherence using a contrastive representation learning objective. Based on these representations, HILL dynamically builds latent landmark graphs and employs a novelty measure on nodes and a utility measure on edges. Finally, HILL develops a subgoal selection strategy that balances exploration and exploitation by jointly considering both measures. Experimental results demonstrate that HILL outperforms state-of-the-art baselines on continuous control tasks with sparse rewards in sample efficiency and asymptotic performance. Our code is available at this https URL.
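The abstract does not say how the novelty and utility measures are defined or combined, so the following is only a minimal sketch of the general idea, not the HILL implementation. It assumes a count-based novelty on nodes, externally supplied edge utilities, and a linear trade-off weight `beta`; the function names `count_novelty` and `select_subgoal` and all data values are hypothetical.

```python
import math


def count_novelty(visit_counts):
    """Assumed novelty measure: rarely visited landmarks score higher."""
    return {node: 1.0 / math.sqrt(count + 1) for node, count in visit_counts.items()}


def select_subgoal(current, neighbors, novelty, utility, beta=0.5):
    """Pick the neighboring landmark with the best weighted score.

    beta = 1.0 uses node novelty only (pure exploration);
    beta = 0.0 uses edge utility only (pure exploitation).
    The linear combination is an assumption made for illustration.
    """
    best_node, best_score = None, -math.inf
    for node in neighbors[current]:
        score = beta * novelty[node] + (1.0 - beta) * utility[(current, node)]
        if score > best_score:
            best_node, best_score = node, score
    return best_node


# Toy landmark graph: from "a" the agent can move toward "b" or "c".
neighbors = {"a": ["b", "c"]}
novelty = count_novelty({"a": 10, "b": 0, "c": 8})  # "b" has never been visited
utility = {("a", "b"): 0.1, ("a", "c"): 0.9}        # edge toward "c" looks more rewarding

explore_choice = select_subgoal("a", neighbors, novelty, utility, beta=1.0)
exploit_choice = select_subgoal("a", neighbors, novelty, utility, beta=0.0)
```

With these toy values, a novelty-only weighting selects the unvisited landmark `"b"`, while a utility-only weighting selects `"c"`; intermediate values of `beta` trade the two off.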

Comments:	Accepted at the International Joint Conference on Neural Networks (IJCNN) 2023
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2307.12063 [cs.LG]
	(or arXiv:2307.12063v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2307.12063

Submission history

From: Zhang Qingyang
[v1] Sat, 22 Jul 2023 12:10:23 UTC (3,148 KB)
