Rights statement: ©2017 IEEE. Translations and content mining are permitted for academic research only. Personal use is also permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Accepted author manuscript, 2.9 MB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Multi-Temporal Depth Motion Maps-Based Local Binary Patterns for 3D Human Action Recognition
AU - Chen, Chen
AU - Liu, Mengyuan
AU - Liu, Hong
AU - Zhang, Baochang
AU - Han, Jungong
AU - Kehtarnavaz, Nasser
N1 - ©2017 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
PY - 2017/11/7
Y1 - 2017/11/7
N2 - This paper presents a local spatio-temporal descriptor for action recognition from depth video sequences which is capable of distinguishing similar actions as well as coping with different speeds of actions. This descriptor is based on three processing stages. In the first stage, the shape and motion cues are captured from a weighted depth sequence by temporally overlapped depth segments, leading to three improved depth motion maps (DMMs) compared to previously introduced DMMs. In the second stage, the improved DMMs are partitioned into dense patches, from which the local binary patterns histogram features are extracted to characterize local rotation invariant texture information. In the final stage, a Fisher kernel is used for generating a compact feature representation, which is then combined with a kernel-based extreme learning machine (ELM) classifier. The developed solution is applied to five public domain datasets and is extensively evaluated. The results obtained demonstrate the effectiveness of this solution as compared to the existing approaches.
AB - This paper presents a local spatio-temporal descriptor for action recognition from depth video sequences which is capable of distinguishing similar actions as well as coping with different speeds of actions. This descriptor is based on three processing stages. In the first stage, the shape and motion cues are captured from a weighted depth sequence by temporally overlapped depth segments, leading to three improved depth motion maps (DMMs) compared to previously introduced DMMs. In the second stage, the improved DMMs are partitioned into dense patches, from which the local binary patterns histogram features are extracted to characterize local rotation invariant texture information. In the final stage, a Fisher kernel is used for generating a compact feature representation, which is then combined with a kernel-based extreme learning machine (ELM) classifier. The developed solution is applied to five public domain datasets and is extensively evaluated. The results obtained demonstrate the effectiveness of this solution as compared to the existing approaches.
U2 - 10.1109/ACCESS.2017.2759058
DO - 10.1109/ACCESS.2017.2759058
M3 - Journal article
VL - 5
SP - 22590
EP - 22604
JO - IEEE Access
JF - IEEE Access
SN - 2169-3536
ER -
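
The first stage described in the abstract (motion cues gathered from temporally overlapped depth segments) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the basic depth motion map definition (the sum of absolute differences between consecutive depth frames) and handles a single front-view projection only. The weighting scheme and the three improved DMMs mentioned in the abstract are not reproduced. The function names and the `seg_len` and `stride` parameters are hypothetical, chosen for illustration.

```python
import numpy as np


def depth_motion_map(frames):
    """Accumulate absolute frame-to-frame depth differences.

    Assumption: basic DMM definition, front view only; the abstract's
    weighting of the depth sequence is not modelled here.
    """
    frames = np.asarray(frames, dtype=np.float64)
    return np.abs(np.diff(frames, axis=0)).sum(axis=0)


def overlapped_segments(num_frames, seg_len, stride):
    """Return (start, end) index pairs of sliding temporal windows.

    Windows overlap whenever stride < seg_len (hypothetical parameters).
    """
    return [(s, s + seg_len)
            for s in range(0, num_frames - seg_len + 1, stride)]


def multi_temporal_dmms(frames, seg_len, stride):
    """Compute one DMM per overlapping temporal segment."""
    frames = np.asarray(frames, dtype=np.float64)
    return [depth_motion_map(frames[a:b])
            for a, b in overlapped_segments(len(frames), seg_len, stride)]


if __name__ == "__main__":
    # Synthetic depth video: 5 frames of 2x2 pixels, depth rising by 1 per frame.
    video = np.stack([np.full((2, 2), float(t)) for t in range(5)])
    for dmm in multi_temporal_dmms(video, seg_len=3, stride=1):
        print(dmm)
```

Shorter segments emphasise brief motions and longer ones summarise the whole action, which is one plausible reading of how overlapped segments help with actions performed at different speeds. In the pipeline the abstract describes, the later stages (patch-wise local binary pattern histograms, Fisher kernel encoding, and the kernel-based ELM classifier) would operate on maps like these.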