Multi-agent reinforcement learning for cost-aware collaborative task execution in energy-harvesting D2D networks


In device-to-device (D2D) networks, multiple resource-limited mobile devices cooperate with one another to execute computation tasks. Because the battery capacity of mobile devices is limited, computation tasks running on a device terminate once its battery is depleted. To achieve sustainable computation, energy-harvesting technology has been introduced into D2D networks. How to make multiple energy-harvesting mobile devices work collaboratively to minimize the long-term system cost of task execution under limited computing, network, and battery capacity constraints remains a challenging issue. To address this challenge, in this paper we design a multi-agent deep deterministic policy gradient (MADDPG) based cost-aware collaborative task-execution (CACTE) scheme for energy-harvesting D2D (EH-D2D) networks. To validate the CACTE scheme's performance, we conducted extensive experiments comparing it with four baseline algorithms: Local, Random, ECLB (Energy Capacity Load Balance), and CCLB (Computing Capacity Load Balance). The experiments varied several system parameters, such as the mobile devices' battery capacity, task workload, and bandwidth. The experimental results show that the CACTE scheme enables multiple mobile devices to cooperate effectively to execute many more tasks and achieve a higher long-term reward, including lower task latency and fewer dropped tasks.

Divisions: College of Business and Social Sciences > Aston Business School
College of Business and Social Sciences > Aston Business School > Operations & Information Management
Additional Information: © 2021, Elsevier. Licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license. Funding Information: This work was supported by the National Science Foundation of China (No. 61802095, 61572162, 61572251), the Zhejiang Provincial National Science Foundation of China (No. LQ19F020011, LQ17F020003), the Zhejiang Provincial Key Science and Technology Project Foundation (No. 2018C01012), the Open Foundation of State Key Laboratory of Networking and Switching Technology (Beijing University of Posts and Telecommunications) (No. SKLNST-2019-2-15), and VC Research (VCR 0000111).
Uncontrolled Keywords: collaborative task execution, cost-aware, D2D networks, multi-agent deep deterministic policy gradient, partially observable Markov decision process, Computer Networks and Communications
Publication ISSN: 1389-1286
Related URLs: http://www.scop ... tnerID=8YFLogxK (Scopus URL)
https://www.sci ... 2334?via%3Dihub (Publisher URL)
PURE Output Type: Article
Published Date: 2021-08-04
Published Online Date: 2021-05-29
Accepted Date: 2021-05-15
Authors: Huang, Binbin
Liu, Xiao
Wang, Shangguang
Pan, Linxuan
Chang, Victor (ORCID Profile 0000-0002-8012-5852)
