Multi-camera Torso Pose Estimation using Graph Neural Networks


Estimating the location and orientation of humans is an essential skill for service and assistive robots. Multiple RGBD cameras are frequently used to achieve reliable estimation over a wide area such as an apartment. These setups have two drawbacks: they are relatively expensive, and they seldom fuse the data from the multiple camera sources at an early stage of the processing pipeline. Occlusions and partial views make the second drawback particularly relevant in these scenarios. This paper proposes using graph neural networks to merge the information acquired from multiple camera sources, achieving a mean absolute error below 125 mm for location and below 10 degrees for orientation using low-resolution RGB images. The experiments, conducted in an apartment with three cameras, benchmarked two different graph neural network implementations and a third architecture based on fully connected layers. The software has been released as open source in a public repository.
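
The two error figures quoted in the abstract are mean absolute errors. As a rough illustration only, the sketch below shows one common way such metrics can be computed. It is not taken from the released software: the function names, the per-coordinate averaging for location, and the 2D position format are all assumptions made for this example. The one detail worth noting is that orientation error has to be wrapped, so that 350 degrees and 10 degrees count as 20 degrees apart rather than 340.

```python
import math


def location_mae_mm(predicted, ground_truth):
    """Mean absolute error over all coordinates, in millimetres.

    Assumption: positions are (x, y) tuples in mm and the error is averaged
    per coordinate. The paper may define the location error differently.
    """
    errors = [
        abs(p - t)
        for pred_pos, true_pos in zip(predicted, ground_truth)
        for p, t in zip(pred_pos, true_pos)
    ]
    return sum(errors) / len(errors)


def angular_error_deg(predicted_deg, ground_truth_deg):
    """Smallest absolute difference between two angles, in degrees."""
    # Shift the difference into [-180, 180) so the error wraps around 360.
    diff = (predicted_deg - ground_truth_deg + 180.0) % 360.0 - 180.0
    return abs(diff)


def orientation_mae_deg(predicted, ground_truth):
    """Mean of the wrapped angular errors, in degrees."""
    errors = [angular_error_deg(p, t) for p, t in zip(predicted, ground_truth)]
    return sum(errors) / len(errors)
```

With these hypothetical helpers, `angular_error_deg(350.0, 10.0)` evaluates to 20.0, which is the behaviour an orientation metric needs near the 0/360 boundary.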

Divisions: College of Engineering & Physical Sciences
Additional Information: CC BY-SA © 2020 The Authors
Event Title: IEEE International Conference on Robot & Human Interactive Communication
Event Type: Other
Event Dates: 2020-08-31 - 2020-09-04
Uncontrolled Keywords: human tracking, graph neural networks, sensorised environments, Artificial Intelligence, Communication, Social Psychology, Human-Computer Interaction
ISBN: 978-1-7281-6076-4, 978-1-7281-6075-7
Last Modified: 27 Jun 2024 12:33
Date Deposited: 14 Sep 2020 10:07
Related URLs: http://www.scop ... tnerID=8YFLogxK (Scopus URL)
PURE Output Type: Conference contribution
Published Date: 2020-10-14
Accepted Date: 2020-06-27
Authors: Rodriguez-Criado, Daniel
Bachiller, Pilar
Bustos, Pablo
Vogiatzis, George (ORCID Profile 0000-0002-3226-0603)
Manso, Luis J. (ORCID Profile 0000-0003-2616-1120)

