Huang, Binbin, Liu, Xiao, Wang, Shangguang, Pan, Linxuan and Chang, Victor (2021). Multi-agent reinforcement learning for cost-aware collaborative task execution in energy-harvesting D2D networks. Computer Networks, 195.
Abstract
In device-to-device (D2D) networks, multiple resource-limited mobile devices cooperate with one another to execute computation tasks. Because the battery capacity of mobile devices is limited, the computation tasks running on a device terminate once its battery is exhausted. To achieve sustainable computation, energy-harvesting technology has been introduced into D2D networks. How to make multiple energy-harvesting mobile devices work collaboratively to minimize the long-term system cost of task execution under limited computing, network and battery capacity constraints remains a challenging issue. To address this challenge, we design a multi-agent deep deterministic policy gradient (MADDPG) based cost-aware collaborative task-execution (CACTE) scheme for energy-harvesting D2D (EH-D2D) networks. To validate the CACTE scheme's performance, we conducted extensive experiments comparing it with four baseline algorithms: Local, Random, ECLB (Energy Capacity Load Balance) and CCLB (Computing Capacity Load Balance). Experiments were carried out under various system parameters, such as the mobile devices' battery capacity, task workload and bandwidth. The experimental results show that the CACTE scheme enables multiple mobile devices to cooperate effectively with one another, executing many more tasks and achieving a higher long-term reward, including lower task latency and fewer dropped tasks.
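The baseline comparison is the most concrete algorithmic detail given in this record. The sketch below is purely illustrative and not taken from the paper: it shows, under assumed device attributes and assumed selection rules, how the Local, Random, ECLB and CCLB task-assignment heuristics named in the abstract might look in code.

```python
# Illustrative sketch (not from the paper): possible forms of the baseline
# task-assignment policies named in the abstract. Device attributes and
# selection rules are assumptions made for illustration only.
import random
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    battery: float       # remaining harvested energy (assumed unit: J)
    cpu_capacity: float  # CPU cycles/s the device can offer
    queued_load: float   # CPU cycles already queued on the device


def assign_local(task_owner, devices):
    """Local: the device that generated the task executes it itself."""
    return task_owner


def assign_random(task_owner, devices):
    """Random: pick any reachable device uniformly at random."""
    return random.choice(devices)


def assign_eclb(task_owner, devices):
    """ECLB (Energy Capacity Load Balance): favour the device with the most
    remaining battery energy, spreading the energy cost of execution."""
    return max(devices, key=lambda d: d.battery)


def assign_cclb(task_owner, devices):
    """CCLB (Computing Capacity Load Balance): favour the device with the most
    spare computing capacity (capacity minus already-queued load)."""
    return max(devices, key=lambda d: d.cpu_capacity - d.queued_load)


if __name__ == "__main__":
    devices = [
        Device("d1", battery=3.0, cpu_capacity=1.0e9, queued_load=4e8),
        Device("d2", battery=5.5, cpu_capacity=8.0e8, queued_load=1e8),
        Device("d3", battery=1.2, cpu_capacity=1.5e9, queued_load=9e8),
    ]
    owner = devices[0]
    print("ECLB picks:", assign_eclb(owner, devices).name)  # highest battery
    print("CCLB picks:", assign_cclb(owner, devices).name)  # most spare CPU
```

In contrast to these static heuristics, the MADDPG-based CACTE scheme learns each device's offloading policy from interaction with the environment, which is what allows it to trade off energy, queue load and network conditions over the long term.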
Publication DOI: https://doi.org/10.1016/j.comnet.2021.108176
Divisions: College of Business and Social Sciences > Aston Business School; College of Business and Social Sciences > Aston Business School > Operations & Information Management
Funding Information: This work was supported by the National Science Foundation of China (No. 61802095, 61572162, 61572251), the Zhejiang Provincial National Science Foundation of China (No. LQ19F020011, LQ17F020003), the Zhejiang Provincial Key Science and Technology Project Foundation (No. 2018C01012), the Open Foundation of State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) (No. SKLNST-2019-2-15), and VC Research (VCR 0000111).
Additional Information: © 2021, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/
Uncontrolled Keywords: collaborative task execution, cost-aware, D2D networks, multi-agent deep deterministic policy gradient, partially observable Markov decision process, Computer Networks and Communications
Publication ISSN: 1389-1286
Last Modified: 12 Nov 2024 17:45
Date Deposited: 09 Jun 2022 10:43
Full Text Link:
Related URLs: http://www.scop ... tnerID=8YFLogxK (Scopus URL); https://www.sci ... 2334?via%3Dihub (Publisher URL)
PURE Output Type: Article
Published Date: 2021-08-04
Published Online Date: 2021-05-29
Accepted Date: 2021-05-15
Authors: Huang, Binbin; Liu, Xiao; Wang, Shangguang; Pan, Linxuan; Chang, Victor (ORCID: 0000-0002-8012-5852)
Version: Accepted Version
License: Creative Commons Attribution Non-commercial No Derivatives