Lyapunov-guided Deep Reinforcement Learning for Stable Online Computation Offloading in Mobile-Edge Computing Networks

Bi, Suzhi; Huang, Liang; Wang, Hui; Zhang, Ying-Jun Angela

arXiv:2010.01370 (cs)

[Submitted on 3 Oct 2020 (v1), last revised 7 Jul 2021 (this version, v3)]

Abstract: Opportunistic computation offloading is an effective method to improve the computation performance of mobile-edge computing (MEC) networks in dynamic edge environments. In this paper, we consider a multi-user MEC network with time-varying wireless channels and stochastic user task data arrivals in sequential time frames. In particular, we aim to design an online computation offloading algorithm that maximizes the network data processing capability subject to long-term data queue stability and average power constraints. The online algorithm is practical in the sense that the decisions for each time frame are made without assuming knowledge of future channel conditions and data arrivals. We formulate the problem as a multi-stage stochastic mixed integer non-linear programming (MINLP) problem that jointly determines the binary offloading decisions (each user computes its task either locally or at the edge server) and the system resource allocation decisions in sequential time frames. To address the coupling among the decisions of different time frames, we propose a novel framework, named LyDROO, that combines the advantages of Lyapunov optimization and deep reinforcement learning (DRL). Specifically, LyDROO first applies Lyapunov optimization to decouple the multi-stage stochastic MINLP into deterministic per-frame MINLP subproblems. By doing so, it guarantees that all the long-term constraints are satisfied by solving per-frame subproblems that are much smaller in size. Then, LyDROO integrates model-based optimization and model-free DRL to solve the per-frame MINLP problems with low computational complexity. Simulation results show that, under various network setups, the proposed LyDROO achieves optimal computation performance while stabilizing all queues in the system. In addition, it incurs very low execution latency, which makes it particularly suitable for real-time implementation in fast fading environments.
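The decoupling step described in the abstract can be illustrated with a generic drift-plus-penalty sketch. This is not the authors' formulation or code: the queue update rules, the per-frame score, the trade-off weight `V`, the `options` table of (bits processed, energy used) pairs, and all function names are assumptions chosen for illustration. The per-frame problem is solved here by brute force over the binary offloading decisions, which stands in for the model-based optimization and DRL that LyDROO uses.

```python
from dataclasses import dataclass
from itertools import product


@dataclass
class Queues:
    data: list[float]    # per-user data backlog (assumed actual queue)
    energy: list[float]  # per-user virtual queue tracking the power constraint


def per_frame_score(q, V, rates, energies):
    # Assumed drift-plus-penalty objective: backlog-weighted throughput
    # minus virtual-queue-weighted energy use. Larger V favors throughput.
    return sum(
        (q.data[i] + V) * rates[i] - q.energy[i] * energies[i]
        for i in range(len(rates))
    )


def choose_offloading(q, V, options):
    # options[i][mode] = (bits processed, energy used) for user i,
    # with mode 0 = compute locally and mode 1 = offload to the edge.
    # Brute force over all binary decisions; hypothetical stand-in
    # for the learned per-frame solver.
    def score(modes):
        rates = [options[i][m][0] for i, m in enumerate(modes)]
        energies = [options[i][m][1] for i, m in enumerate(modes)]
        return per_frame_score(q, V, rates, energies)

    return max(product((0, 1), repeat=len(options)), key=score)


def update_queues(q, arrivals, rates, energies, power_budget):
    # Standard queue dynamics: serve, clip at zero, then add new arrivals.
    data = [max(Q - r, 0.0) + a for Q, r, a in zip(q.data, rates, arrivals)]
    # The virtual queue grows whenever energy use exceeds the average budget.
    energy = [max(Y + e - power_budget, 0.0) for Y, e in zip(q.energy, energies)]
    return Queues(data, energy)


# Toy frame with two users (all numbers are made up).
queues = Queues(data=[4.0, 0.0], energy=[0.0, 10.0])
options = [
    [(1.0, 0.5), (2.0, 2.0)],  # user 0: local, edge
    [(1.0, 0.5), (3.0, 1.0)],  # user 1: local, edge
]
modes = choose_offloading(queues, V=1.0, options=options)
rates = [options[i][m][0] for i, m in enumerate(modes)]
energies = [options[i][m][1] for i, m in enumerate(modes)]
next_queues = update_queues(queues, [1.0, 1.0], rates, energies, power_budget=1.0)
```

In this toy frame, user 0 has a large data backlog and no accumulated energy debt, so offloading scores higher for it, while user 1 has a large virtual energy queue and is pushed toward the cheaper local mode. Each frame is decided using only the current queue states, which is the sense in which the long-term constraints are handled without knowledge of future channels or arrivals.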

Comments:	The paper has been accepted for publication by IEEE Trans. Wireless Communications; the source code associated with the paper is available at this https URL. arXiv admin note: text overlap with arXiv:2102.03286
Subjects:	Networking and Internet Architecture (cs.NI)
Cite as:	arXiv:2010.01370 [cs.NI]
	(or arXiv:2010.01370v3 [cs.NI] for this version)
	https://doi.org/10.48550/arXiv.2010.01370

Submission history

From: Suzhi Bi
[v1] Sat, 3 Oct 2020 14:49:55 UTC (1,517 KB)
[v2] Mon, 1 Mar 2021 17:01:36 UTC (2,135 KB)
[v3] Wed, 7 Jul 2021 01:58:34 UTC (2,137 KB)
