Enhancing Robot Social Navigation with Reinforcement Learning and Advanced Predictive Models: Cosine-Gated-LSTM and Adaptive Predictive Horizons

Abstract

This thesis explores Social Robot Navigation (SocNav) in human-centric environments, a field of growing importance as robots become integral to sectors such as healthcare, hospitality, and public service. The research focuses on integrating Reinforcement Learning (RL) with advanced predictive models to improve how robots navigate and interact in social environments. A significant contribution of this work is the development of novel predictive world models and their integration into RL frameworks. These models strengthen the agent's ability to predict future states, which in turn makes decision-making more efficient and more adaptable in dynamic social environments. However, the initial implementation used fixed prediction horizons, such as always predicting two steps ahead in the 2StepAhead model, and this revealed limitations in flexibility and computational efficiency. To address this, we introduced an entropy-driven adaptive prediction horizon mechanism that adjusts the prediction horizon according to real-time policy entropy, balancing computational cost against the need for long-term future state prediction. A central method of this thesis is the Cosine-Gated Long Short-Term Memory (CGLSTM) model. By adding a cosine similarity-based gating mechanism to vanilla Long Short-Term Memory (LSTM) networks, CGLSTM significantly advances sequence prediction capabilities. The model consistently outperformed vanilla LSTM, GRU (Gated Recurrent Unit), and RAU (Recurrent Attention Unit) models, achieving up to a 30% reduction in Mean Absolute Error (MAE) in environments such as FallingBallEnv and SocNavGym.
Furthermore, integrating CGLSTM into DreamerV3, a state-of-the-art model-based reinforcement learning framework that learns a latent world model and plans actions through imagination, resulted in an approximately 5% increase in cumulative reward, demonstrating that stronger predictive sequence models can directly enhance RL performance. The thesis also addresses the computational challenges that predictive models face as environmental complexity varies. The entropy-driven adaptive prediction horizon mechanism mitigates these challenges by adjusting the prediction horizon in response to environmental uncertainty, yielding a 15% improvement in success rates in high-entropy scenarios at the cost of only a 2% increase in inference time in low-entropy situations. Overall, this thesis contributes to the advancement of SocNav and predictive modeling within RL, laying the groundwork for future research aimed at integrating robots more intuitively into society. The developed models improve robots' ability to navigate complex environments through stronger predictive models and greater computational efficiency, paving the way for seamless integration into various sectors.
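The entropy-driven horizon adjustment described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names, the linear mapping from entropy to horizon, and the `min_horizon` and `max_horizon` bounds are all assumptions made for illustration. It also assumes a discrete action distribution and that higher policy entropy calls for a longer prediction horizon.

```python
import math


def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def adaptive_horizon(probs, min_horizon=1, max_horizon=5):
    """Map policy entropy to an integer prediction horizon.

    Hypothetical rule (an assumption, not taken from the thesis):
    normalize the entropy by its maximum, log(n_actions), and
    interpolate linearly between min_horizon and max_horizon.
    """
    n_actions = len(probs)
    if n_actions < 2:
        # A single action carries no uncertainty.
        return min_horizon
    normalized = policy_entropy(probs) / math.log(n_actions)
    # Clamp to [0, 1] to guard against floating-point drift.
    normalized = min(max(normalized, 0.0), 1.0)
    return min_horizon + round(normalized * (max_horizon - min_horizon))


confident = [1.0, 0.0, 0.0, 0.0]      # policy is certain: shortest horizon
uncertain = [0.25, 0.25, 0.25, 0.25]  # policy is uniform: longest horizon
print(adaptive_horizon(confident), adaptive_horizon(uncertain))
```

Under these assumptions, a confident policy pays for only a short rollout, while an uncertain policy triggers deeper prediction, which mirrors the trade-off between inference time and success rate reported above.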

Publication DOI: https://doi.org/10.48780/publications.aston.ac.uk.00048417
Divisions: College of Engineering & Physical Sciences > School of Computer Science and Digital Technologies
Additional Information: Copyright © Dirichukwu Goodluck Oguzie, 2024. Dirichukwu Goodluck Oguzie asserts their moral right to be identified as the author of this thesis. This copy of the thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with its author and that no quotation from the thesis and no information derived from it may be published without appropriate permission or acknowledgement.
Institution: Aston University
Uncontrolled Keywords: Social Robot Navigation, Reinforcement Learning, Predictive World Models, Cosine-Gated LSTM, Adaptive Horizon, DreamerV3, Sequence Prediction, Computational Efficiency, Human-Robot Interaction
Last Modified: 27 Nov 2025 13:12
Date Deposited: 27 Nov 2025 13:10
Completed Date: 2024-12
Authors: Oguzie, Dirichukwu Goodluck
