Learning action recognition model from depth and skeleton videos

Computing and Communications

Associated organisational unit

DSI - Foundations

Electronic data

RahmaniandBennamoun_ICCV2017
Accepted author manuscript, 3.95 MB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Text available via DOI:

https://doi.org/10.1109/ICCV.2017.621
Final published version

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Chapter

Published

Hossein Rahmani
Mohammed Bennamoun

Publication date	22/12/2017
Host publication	Proceedings of the IEEE International Conference on Computer Vision
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	5833-5842
Number of pages	10
ISBN (print)	9781538610329
Original language	English

Publication series

Name	Proceedings of the IEEE International Conference on Computer Vision
Volume	2017-October

Abstract

Depth sensors open up possibilities of dealing with the human action recognition problem by providing 3D human skeleton data and depth images of the scene. Analysis of human actions based on 3D skeleton data has become popular recently, due to its robustness and view-invariant representation. However, the skeleton alone is insufficient to distinguish actions which involve human-object interactions. In this paper, we propose a deep model which efficiently models human-object interactions and intra-class variations under viewpoint changes. First, a human body-part model is introduced to transfer the depth appearances of body-parts to a shared view-invariant space. Second, an end-to-end learning framework is proposed which is able to effectively combine the view-invariant body-part representation from skeletal and depth images, and learn the relations between the human body-parts and the environmental objects, the interactions between different human body-parts, and the temporal structure of human actions. We have evaluated the performance of our proposed model against 15 existing techniques on two large benchmark human action recognition datasets including NTU RGB+D and UWA3DII. The experimental results show that our technique provides a significant improvement over state-of-the-art methods.
