

3D action recognition from novel viewpoints

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Chapter

Published

Standard

3D action recognition from novel viewpoints. / Rahmani, Hossein; Mian, Ajmal.
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016. p. 1506-1515 (2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)).

Harvard

Rahmani, H & Mian, A 2016, 3D action recognition from novel viewpoints. in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, pp. 1506-1515. https://doi.org/10.1109/CVPR.2016.167

APA

Rahmani, H., & Mian, A. (2016). 3D action recognition from novel viewpoints. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (pp. 1506-1515). (2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)). IEEE. https://doi.org/10.1109/CVPR.2016.167

Vancouver

Rahmani H, Mian A. 3D action recognition from novel viewpoints. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. 2016. p. 1506-1515. (2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)). doi: 10.1109/CVPR.2016.167

Author

Rahmani, Hossein ; Mian, Ajmal. / 3D action recognition from novel viewpoints. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016. pp. 1506-1515 (2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)).

Bibtex

@inbook{baddb115e28b47dfbe8cdfee783e6f2d,
title = "3D action recognition from novel viewpoints",
abstract = "We propose a human pose representation model that transfers human poses acquired from different unknown views to a view-invariant high-level space. The model is a deep convolutional neural network and requires a large corpus of multiview training data which is very expensive to acquire. Therefore, we propose a method to generate this data by fitting synthetic 3D human models to real motion capture data and rendering the human poses from numerous viewpoints. While learning the CNN model, we do not use action labels but only the pose labels after clustering all training poses into k clusters. The proposed model is able to generalize to real depth images of unseen poses without the need for re-training or fine-tuning. Real depth videos are passed through the model frame-wise to extract view-invariant features. For spatio-temporal representation, we propose group sparse Fourier Temporal Pyramid which robustly encodes the action specific most discriminative output features of the proposed human pose model. Experiments on two multiview and three single-view benchmark datasets show that the proposed method dramatically outperforms existing state-of-the-art in action recognition.",
author = "Hossein Rahmani and Ajmal Mian",
year = "2016",
month = jun,
day = "27",
doi = "10.1109/CVPR.2016.167",
language = "English",
isbn = "9781467388511",
series = "2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)",
publisher = "IEEE",
pages = "1506--1515",
booktitle = "2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)",
}
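
The spatio-temporal encoding step named in the abstract, a Fourier Temporal Pyramid over per-frame features, can be sketched in plain numpy. This is a minimal illustration only, not the authors' implementation: the group-sparse feature selection described in the paper is omitted, and the function name and parameters (`levels`, `n_coeffs`) are illustrative assumptions. The idea is to split the time axis into progressively finer segments and keep the magnitudes of a few low-frequency Fourier coefficients of each segment, yielding a fixed-length, temporally robust descriptor.

```python
import numpy as np

def fourier_temporal_pyramid(features, levels=3, n_coeffs=4):
    """Encode a (T, D) sequence of per-frame features as a fixed-length vector.

    At pyramid level l the time axis is split into 2**l segments; for each
    segment, the magnitudes of the first `n_coeffs` low-frequency Fourier
    coefficients (per feature dimension) are kept and concatenated.
    """
    T, D = features.shape
    parts = []
    for level in range(levels):
        n_seg = 2 ** level
        bounds = np.linspace(0, T, n_seg + 1).astype(int)
        for s in range(n_seg):
            seg = features[bounds[s]:bounds[s + 1]]          # (t_s, D)
            spec = np.abs(np.fft.rfft(seg, axis=0))[:n_coeffs]
            if spec.shape[0] < n_coeffs:                      # pad short segments
                spec = np.pad(spec, ((0, n_coeffs - spec.shape[0]), (0, 0)))
            parts.append(spec.ravel())
    return np.concatenate(parts)

# Example: 60 frames of 8-dimensional per-frame features
rng = np.random.default_rng(0)
video = rng.normal(size=(60, 8))
desc = fourier_temporal_pyramid(video)
print(desc.shape)  # 1 + 2 + 4 = 7 segments, each 4 * 8 values -> (224,)
```

Discarding high-frequency coefficients makes the descriptor insensitive to small temporal misalignments, while the pyramid retains coarse temporal ordering; the paper's contribution layers a group-sparse selection on top of this, which is not shown here.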

RIS

TY - CHAP

T1 - 3D action recognition from novel viewpoints

AU - Rahmani, Hossein

AU - Mian, Ajmal

PY - 2016/6/27

Y1 - 2016/6/27

N2 - We propose a human pose representation model that transfers human poses acquired from different unknown views to a view-invariant high-level space. The model is a deep convolutional neural network and requires a large corpus of multiview training data which is very expensive to acquire. Therefore, we propose a method to generate this data by fitting synthetic 3D human models to real motion capture data and rendering the human poses from numerous viewpoints. While learning the CNN model, we do not use action labels but only the pose labels after clustering all training poses into k clusters. The proposed model is able to generalize to real depth images of unseen poses without the need for re-training or fine-tuning. Real depth videos are passed through the model frame-wise to extract view-invariant features. For spatio-temporal representation, we propose group sparse Fourier Temporal Pyramid which robustly encodes the action specific most discriminative output features of the proposed human pose model. Experiments on two multiview and three single-view benchmark datasets show that the proposed method dramatically outperforms existing state-of-the-art in action recognition.

AB - We propose a human pose representation model that transfers human poses acquired from different unknown views to a view-invariant high-level space. The model is a deep convolutional neural network and requires a large corpus of multiview training data which is very expensive to acquire. Therefore, we propose a method to generate this data by fitting synthetic 3D human models to real motion capture data and rendering the human poses from numerous viewpoints. While learning the CNN model, we do not use action labels but only the pose labels after clustering all training poses into k clusters. The proposed model is able to generalize to real depth images of unseen poses without the need for re-training or fine-tuning. Real depth videos are passed through the model frame-wise to extract view-invariant features. For spatio-temporal representation, we propose group sparse Fourier Temporal Pyramid which robustly encodes the action specific most discriminative output features of the proposed human pose model. Experiments on two multiview and three single-view benchmark datasets show that the proposed method dramatically outperforms existing state-of-the-art in action recognition.

U2 - 10.1109/CVPR.2016.167

DO - 10.1109/CVPR.2016.167

M3 - Chapter

SN - 9781467388511

T3 - 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

SP - 1506

EP - 1515

BT - 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

PB - IEEE

ER -