Rights statement: ©2019 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Accepted author manuscript, 3.09 MB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - Deep Manifold Structure Transfer for Action Recognition
AU - Li, Ce
AU - Zhang, Baochang
AU - Chen, Chen
AU - Ye, Qixiang
AU - Han, Jungong
AU - Guo, Guodong
AU - Ji, Rongrong
N1 - ©2019 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
PY - 2019/9/1
Y1 - 2019/9/1
N2 - While the intrinsic data structure in a subspace provides useful information for visual recognition, it has not yet been well studied in deep feature learning for action recognition. In this paper, we introduce a new spatio-temporal manifold network (STMN) that leverages data manifold structures to regularize deep action feature learning, aiming at simultaneously minimizing the intra-class variations of learned deep features and alleviating the over-fitting problem. To this end, the manifold prior is imposed from the top layer of a convolutional neural network (CNN) and is propagated across convolutional layers during forward-backward propagation. The observed correspondence of manifold structures in the data space and feature space validates that the manifold prior can be transferred across CNN layers. STMN theoretically recasts the problem of transferring the data structure prior into deep learning architectures as a projection over the manifold via an embedding method, which can be easily solved by an Alternating Direction Method of Multipliers and Backward Propagation (ADMM-BP) algorithm. STMN is generic in the sense that it can be plugged into various backbone architectures to learn more discriminative representations for action recognition. Extensive experimental results show that our method achieves performance comparable to or better than state-of-the-art approaches on four benchmark datasets.
AB - While the intrinsic data structure in a subspace provides useful information for visual recognition, it has not yet been well studied in deep feature learning for action recognition. In this paper, we introduce a new spatio-temporal manifold network (STMN) that leverages data manifold structures to regularize deep action feature learning, aiming at simultaneously minimizing the intra-class variations of learned deep features and alleviating the over-fitting problem. To this end, the manifold prior is imposed from the top layer of a convolutional neural network (CNN) and is propagated across convolutional layers during forward-backward propagation. The observed correspondence of manifold structures in the data space and feature space validates that the manifold prior can be transferred across CNN layers. STMN theoretically recasts the problem of transferring the data structure prior into deep learning architectures as a projection over the manifold via an embedding method, which can be easily solved by an Alternating Direction Method of Multipliers and Backward Propagation (ADMM-BP) algorithm. STMN is generic in the sense that it can be plugged into various backbone architectures to learn more discriminative representations for action recognition. Extensive experimental results show that our method achieves performance comparable to or better than state-of-the-art approaches on four benchmark datasets.
U2 - 10.1109/TIP.2019.2912357
DO - 10.1109/TIP.2019.2912357
M3 - Journal article
VL - 28
SP - 4646
EP - 4658
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
SN - 1057-7149
IS - 9
ER -