Deep Reinforcement Learning and Transfer Learning of Robot In-hand Dexterous Manipulation

Abstract

In recent years, deep reinforcement learning (RL) and imitation learning (IL) have shown remarkable success in many areas of robotics. However, the domain of in-hand dexterous manipulation remains challenging for RL and IL: achieving proficiency in these tasks often requires millions of attempts or demonstrations before a stable strategy is learnt. Consequently, improving learning speed and efficiency is paramount if RL and IL are to be practically applied to real-world in-hand dexterous manipulation. This thesis primarily addresses multi-goal robot in-hand dexterous manipulation tasks, proposing several methods to improve learning efficiency. For RL, (1) Goal Density-based Hindsight Experience Prioritisation (GDP) is proposed to improve learning efficiency by prioritising certain experiences during the replay stage; (2) a further method, Policy-level-based Curriculum Goal Selection (PL-CGS), automatically generates goals during training so as to form a curriculum learning process. For IL, (3) Goal-based Self-Adaptive Generative Adversarial Imitation Learning (Goal-SGAIL) incorporates a self-adaptive mechanism into the GAIL framework that applies to multi-goal learning scenarios. Extensive experiments were conducted in simulation with OpenAI Gym, focusing on robot manipulation tasks, to compare the proposed methods against existing RL and IL approaches. GDP and PL-CGS showed faster learning than the vanilla DDPG+HER method on some of the tasks in the RL experiments. For IL experiments involving sub-optimal demonstrations, especially highly sub-optimal demonstrations from human teleoperation, Goal-SGAIL demonstrated its ability to overcome the demonstrations' sub-optimality and outperformed DDPGfD+HER and Goal-GAIL on some challenging in-hand manipulation tasks.
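The RL methods above build on hindsight experience replay (HER), in which failed transitions are relabelled with goals the agent actually achieved so that sparse rewards become informative. As a rough illustration of the relabelling idea only (not the thesis code; the `her_relabel` function, tuple layout, and sparse reward are illustrative assumptions), a "future"-strategy relabelling step might look like:

```python
import random

def her_relabel(episode, k, reward_fn):
    """Sketch of HER 'future' relabelling (illustrative, not the thesis code).

    episode: list of (obs, action, goal, achieved_goal) tuples.
    For each timestep, sample k goals achieved later in the same episode,
    substitute them for the original goal, and recompute the reward.
    """
    relabelled = []
    for t, (obs, action, goal, achieved) in enumerate(episode):
        # Achieved goals from this timestep onwards ("future" strategy).
        future_goals = [step[3] for step in episode[t:]]
        for _ in range(k):
            new_goal = random.choice(future_goals)
            # Sparse reward recomputed against the substituted goal.
            reward = reward_fn(achieved, new_goal)
            relabelled.append((obs, action, new_goal, reward))
    return relabelled

# Example with a sparse reward: 0 on success, -1 otherwise.
sparse_reward = lambda achieved, goal: 0.0 if achieved == goal else -1.0
episode = [
    ((0,), 0, (3,), (1,)),
    ((1,), 1, (3,), (2,)),
    ((2,), 1, (3,), (3,)),
]
extra = her_relabel(episode, k=2, reward_fn=sparse_reward)
```

GDP can be read as prioritising which of these relabelled experiences are replayed, rather than sampling them uniformly.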

Publication DOI: https://doi.org/10.48780/publications.aston.ac.uk.00046690
Additional Information: Copyright © Yingyi Kuang, 2023. Yingyi Kuang asserts her moral right to be identified as the author of this thesis. This copy of the thesis has been supplied on condition that anyone who consults it is understood to recognise that its copyright rests with its author and that no quotation from the thesis and no information derived from it may be published without appropriate permission or acknowledgement.
Institution: Aston University
Uncontrolled Keywords: Reinforcement learning, HER, Experience prioritisation, Curriculum learning, Learning from demonstration, GAIL
Last Modified: 30 Sep 2024 08:39
Date Deposited: 23 Sep 2024 14:20
Completed Date: 2024-06-05
Authors: Kuang, Yingyi
