Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates

Computing and Communications

Text available via DOI: https://doi.org/10.1109/TPAMI.2017.2771306 (final published version)

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates. / Liu, Jun; Shahroudy, A.; Xu, D. et al.
In: IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 40, No. 12, 31.12.2018, p. 3007-3021.

Harvard

Liu, J, Shahroudy, A, Xu, D, Kot, AC & Wang, G 2018, 'Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 12, pp. 3007-3021. https://doi.org/10.1109/TPAMI.2017.2771306

APA

Liu, J., Shahroudy, A., Xu, D., Kot, A. C., & Wang, G. (2018). Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12), 3007-3021. https://doi.org/10.1109/TPAMI.2017.2771306

Vancouver

Liu J, Shahroudy A, Xu D, Kot AC, Wang G. Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2018 Dec 31;40(12):3007-3021. Epub 2017 Nov 9. doi: 10.1109/TPAMI.2017.2771306

Author

Liu, Jun ; Shahroudy, A. ; Xu, D. et al. / Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates. In: IEEE Transactions on Pattern Analysis and Machine Intelligence. 2018 ; Vol. 40, No. 12. pp. 3007-3021.

Bibtex

@article{4a751de7a510455499b686fbc7c70669,
  title = "Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates",
  abstract = "Skeleton-based human action recognition has attracted a lot of research attention during the past few years. Recent works attempted to utilize recurrent neural networks to model the temporal dependencies between the 3D positional configurations of human body joints for better analysis of human activities in the skeletal data. The proposed work extends this idea to spatial domain as well as temporal domain to better analyze the hidden sources of action-related information within the human skeleton sequences in both of these domains simultaneously. Based on the pictorial structure of Kinect's skeletal data, an effective tree-structure based traversal framework is also proposed. In order to deal with the noise in the skeletal data, a new gating mechanism within LSTM module is introduced, with which the network can learn the reliability of the sequential data and accordingly adjust the effect of the input data on the updating procedure of the long-term context representation stored in the unit's memory cell. Moreover, we introduce a novel multi-modal feature fusion strategy within the LSTM unit in this paper. The comprehensive experimental results on seven challenging benchmark datasets for human action recognition demonstrate the effectiveness of the proposed method.",
  author = "Jun Liu and A. Shahroudy and D. Xu and A.C. Kot and G. Wang",
  year = "2018",
  month = dec,
  day = "31",
  doi = "10.1109/TPAMI.2017.2771306",
  language = "English",
  volume = "40",
  number = "12",
  pages = "3007--3021",
  journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",
  issn = "0162-8828",
  publisher = "IEEE Computer Society",
}

RIS

TY - JOUR

T1 - Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates

AU - Liu, Jun

AU - Shahroudy, A.

AU - Xu, D.

AU - Kot, A.C.

AU - Wang, G.

PY - 2018/12/31

Y1 - 2018/12/31

N2 - Skeleton-based human action recognition has attracted a lot of research attention during the past few years. Recent works attempted to utilize recurrent neural networks to model the temporal dependencies between the 3D positional configurations of human body joints for better analysis of human activities in the skeletal data. The proposed work extends this idea to spatial domain as well as temporal domain to better analyze the hidden sources of action-related information within the human skeleton sequences in both of these domains simultaneously. Based on the pictorial structure of Kinect's skeletal data, an effective tree-structure based traversal framework is also proposed. In order to deal with the noise in the skeletal data, a new gating mechanism within LSTM module is introduced, with which the network can learn the reliability of the sequential data and accordingly adjust the effect of the input data on the updating procedure of the long-term context representation stored in the unit's memory cell. Moreover, we introduce a novel multi-modal feature fusion strategy within the LSTM unit in this paper. The comprehensive experimental results on seven challenging benchmark datasets for human action recognition demonstrate the effectiveness of the proposed method.

AB - Skeleton-based human action recognition has attracted a lot of research attention during the past few years. Recent works attempted to utilize recurrent neural networks to model the temporal dependencies between the 3D positional configurations of human body joints for better analysis of human activities in the skeletal data. The proposed work extends this idea to spatial domain as well as temporal domain to better analyze the hidden sources of action-related information within the human skeleton sequences in both of these domains simultaneously. Based on the pictorial structure of Kinect's skeletal data, an effective tree-structure based traversal framework is also proposed. In order to deal with the noise in the skeletal data, a new gating mechanism within LSTM module is introduced, with which the network can learn the reliability of the sequential data and accordingly adjust the effect of the input data on the updating procedure of the long-term context representation stored in the unit's memory cell. Moreover, we introduce a novel multi-modal feature fusion strategy within the LSTM unit in this paper. The comprehensive experimental results on seven challenging benchmark datasets for human action recognition demonstrate the effectiveness of the proposed method.

U2 - 10.1109/TPAMI.2017.2771306

DO - 10.1109/TPAMI.2017.2771306

M3 - Journal article

VL - 40

SP - 3007

EP - 3021

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

SN - 0162-8828

IS - 12

ER -
