A cross-entropy based direct policy search algorithm for multi-objective energy storage control

Abstract

Effective control of Energy Storage Systems (ESS) is crucial for the secure and profitable operation of microgrids. In this context, ESSs are essential for enhancing the overall grid resilience, balancing supply, and mitigating voltage and frequency variations. This paper presents a novel neuroevolutionary method, coupling a modified version of the Multi-Objective Evolutionary Policy Search (MEPS) algorithm with the Cross-Entropy method, aimed at optimizing an ESS control problem. The modified MEPS, named Cascade-MEPS, employs a cascade weights mutation operator to refine policies by focusing on the most recent hidden node, ensuring localized and non-disruptive adjustments. The resulting algorithm, referred to as cross-entropy Cascade-MEPS (CE-CMEPS), utilizes the cross-entropy method as a depth initialization strategy, conducting an initial exploration of the weights space to initialize the population prior to Cascade-MEPS execution. Experimental validation on a newly proposed multi-objective ESS control problem demonstrates the efficacy of CE-CMEPS, showcasing performance improvements and reduced variation compared to standalone MEPS. Our results show that CE-CMEPS is an effective ESS discharge controller and a sustainable multi-objective reinforcement learning solution.
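
As a rough illustration only, the two ideas named in the abstract (a cross-entropy pass over the weight space to initialize a population, and a mutation restricted to the most recent hidden node) might be sketched as below. This is not the authors' implementation. The function names `cross_entropy_init` and `cascade_mutation`, the Gaussian sampling model, the elite fraction, the flat weight-vector layout, and the scalar `toy_score` are all assumptions made for the example; the problem in the paper is multi-objective, which this sketch does not attempt to reproduce.

```python
import numpy as np


def cross_entropy_init(score, dim, pop_size=50, elite_frac=0.2, iters=20, seed=0):
    """Explore the weight space with a basic cross-entropy loop.

    `score` is a hypothetical scalar objective to maximise (an assumption;
    the paper's problem is multi-objective). Returns a population sampled
    from the final Gaussian, plus that Gaussian's mean.
    """
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(dim), np.ones(dim)
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(iters):
        # Sample candidate weight vectors from the current Gaussian.
        samples = rng.normal(mean, std, size=(pop_size, dim))
        scores = np.array([score(w) for w in samples])
        # Keep the best-scoring candidates and refit the Gaussian to them.
        elite = samples[np.argsort(scores)[-n_elite:]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-3  # small floor so sampling never collapses
    return rng.normal(mean, std, size=(pop_size, dim)), mean


def cascade_mutation(weights, last_node_slice, sigma=0.1, rng=None):
    """Perturb only the weights of the most recently added hidden node.

    `last_node_slice` marks where those weights sit in the flat vector
    (an assumed layout, chosen for the example).
    """
    rng = np.random.default_rng() if rng is None else rng
    mutated = weights.copy()
    noise = rng.normal(0.0, sigma, size=mutated[last_node_slice].shape)
    mutated[last_node_slice] += noise
    return mutated


# Toy usage: a quadratic score stands in for a real policy evaluation.
target = np.array([0.5, -1.0, 2.0, 0.0])


def toy_score(w):
    return -np.sum((w - target) ** 2)


population, fitted_mean = cross_entropy_init(toy_score, dim=4)
child = cascade_mutation(population[0], slice(2, 4), rng=np.random.default_rng(1))
```

In this sketch the population returned by `cross_entropy_init` would seed the evolutionary search, and `cascade_mutation` leaves every weight outside `last_node_slice` untouched, which is one way to read the "localized and non-disruptive adjustments" described above.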

Publication DOI: https://doi.org/10.1007/s00521-025-11785-3
Divisions: College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies > Applied AI & Robotics
College of Engineering & Physical Sciences > Aston Centre for Artificial Intelligence Research and Application
College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies
College of Engineering & Physical Sciences
Aston University (General)
Funding Information: The Article Processing Charge (APC) for the publication of this research was funded by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) (ROR identifier: 00x0ma614).
Additional Information: Copyright © The Author(s) 2026. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/
Uncontrolled Keywords: Direct policy search (DPS),Reinforcement learning,Energy management,Multi-objective (MO) control,Neuroevolution,Neural networks architecture
Publication ISSN: 1433-3058
Last Modified: 09 Mar 2026 17:57
Date Deposited: 17 Feb 2026 15:48
Related URLs: https://link.sp ... 521-025-11785-3 (Publisher URL)
https://www.sco ... ns/105029986789 (Scopus URL)
PURE Output Type: Article
Published Date: 2026-02-13
Published Online Date: 2026-02-13
Accepted Date: 2025-10-09
Authors: Leite, Gabriel Matos Cardoso
Marcelino, Carolina Gil
Jiménez-Fernández, Silvia
Wanner, Elizabeth Fialho (ORCID Profile 0000-0001-6450-3043)
Salcedo-Sanz, Sancho
Pedreira, Carlos Eduardo

Version: Published Version

License: Creative Commons Attribution

