Meet JEANIE: A Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment

Research output: Contribution to Journal/Magazine > Journal article > peer-review

E-pub ahead of print

Standard

Meet JEANIE: A Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment. / Wang, Lei; Liu, Jun; Zheng, Liang et al.
In: International Journal of Computer Vision, Vol. 132, 06.05.2024, p. 4091-4122.

Harvard

Wang, L, Liu, J, Zheng, L, Gedeon, T & Koniusz, P 2024, 'Meet JEANIE: A Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment', International Journal of Computer Vision, vol. 132, pp. 4091-4122. https://doi.org/10.1007/s11263-024-02070-2

APA

Wang, L., Liu, J., Zheng, L., Gedeon, T., & Koniusz, P. (2024). Meet JEANIE: A Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment. International Journal of Computer Vision, 132, 4091-4122. Advance online publication. https://doi.org/10.1007/s11263-024-02070-2

Vancouver

Wang L, Liu J, Zheng L, Gedeon T, Koniusz P. Meet JEANIE: A Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment. International Journal of Computer Vision. 2024 May 6;132:4091-4122. Epub 2024 May 6. doi: 10.1007/s11263-024-02070-2

Author

Wang, Lei ; Liu, Jun ; Zheng, Liang et al. / Meet JEANIE : A Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment. In: International Journal of Computer Vision. 2024 ; Vol. 132. pp. 4091-4122.

Bibtex

@article{fc3b2f3c796f4121af7eec6675b2ad5a,
title = "Meet JEANIE: A Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment",
abstract = "Video sequences exhibit significant nuisance variations (undesired effects) of speed of actions, temporal locations, and subjects{\textquoteright} poses, leading to temporal-viewpoint misalignment when comparing two sets of frames or evaluating the similarity of two sequences. Thus, we propose Joint tEmporal and cAmera viewpoiNt alIgnmEnt (JEANIE) for sequence pairs. In particular, we focus on 3D skeleton sequences whose camera and subjects{\textquoteright} poses can be easily manipulated in 3D. We evaluate JEANIE on skeletal Few-shot Action Recognition (FSAR), where matching well temporal blocks (temporal chunks that make up a sequence) of support-query sequence pairs (by factoring out nuisance variations) is essential due to limited samples of novel classes. Given a query sequence, we create its several views by simulating several camera locations. For a support sequence, we match it with view-simulated query sequences, as in the popular Dynamic Time Warping (DTW). Specifically, each support temporal block can be matched to the query temporal block with the same or adjacent (next) temporal index, and adjacent camera views to achieve joint local temporal-viewpoint warping. JEANIE selects the smallest distance among matching paths with different temporal-viewpoint warping patterns, an advantage over DTW which only performs temporal alignment. We also propose an unsupervised FSAR akin to clustering of sequences with JEANIE as a distance measure. JEANIE achieves state-of-the-art results on NTU-60, NTU-120, Kinetics-skeleton and UWA3D Multiview Activity II on supervised and unsupervised FSAR, and their meta-learning inspired fusion.",
keywords = "Dictionary learning, Dynamic time warping, Few-shot action recognition, Fusion, MAML, Skeletons, Soft assignment, Sparse coding, Supervised, Unsupervised",
author = "Lei Wang and Jun Liu and Liang Zheng and Tom Gedeon and Piotr Koniusz",
year = "2024",
month = may,
day = "6",
doi = "10.1007/s11263-024-02070-2",
language = "English",
volume = "132",
pages = "4091--4122",
journal = "International Journal of Computer Vision",
issn = "0920-5691",
publisher = "Springer Netherlands",
}

RIS

TY  - JOUR
T1  - Meet JEANIE
T2  - A Similarity Measure for 3D Skeleton Sequences via Temporal-Viewpoint Alignment
AU  - Wang, Lei
AU  - Liu, Jun
AU  - Zheng, Liang
AU  - Gedeon, Tom
AU  - Koniusz, Piotr
PY  - 2024/5/6
Y1  - 2024/5/6
AB  - Video sequences exhibit significant nuisance variations (undesired effects) of speed of actions, temporal locations, and subjects’ poses, leading to temporal-viewpoint misalignment when comparing two sets of frames or evaluating the similarity of two sequences. Thus, we propose Joint tEmporal and cAmera viewpoiNt alIgnmEnt (JEANIE) for sequence pairs. In particular, we focus on 3D skeleton sequences whose camera and subjects’ poses can be easily manipulated in 3D. We evaluate JEANIE on skeletal Few-shot Action Recognition (FSAR), where matching well temporal blocks (temporal chunks that make up a sequence) of support-query sequence pairs (by factoring out nuisance variations) is essential due to limited samples of novel classes. Given a query sequence, we create its several views by simulating several camera locations. For a support sequence, we match it with view-simulated query sequences, as in the popular Dynamic Time Warping (DTW). Specifically, each support temporal block can be matched to the query temporal block with the same or adjacent (next) temporal index, and adjacent camera views to achieve joint local temporal-viewpoint warping. JEANIE selects the smallest distance among matching paths with different temporal-viewpoint warping patterns, an advantage over DTW which only performs temporal alignment. We also propose an unsupervised FSAR akin to clustering of sequences with JEANIE as a distance measure. JEANIE achieves state-of-the-art results on NTU-60, NTU-120, Kinetics-skeleton and UWA3D Multiview Activity II on supervised and unsupervised FSAR, and their meta-learning inspired fusion.
KW  - Dictionary learning
KW  - Dynamic time warping
KW  - Few-shot action recognition
KW  - Fusion
KW  - MAML
KW  - Skeletons
KW  - Soft assignment
KW  - Sparse coding
KW  - Supervised
KW  - Unsupervised
U2  - 10.1007/s11263-024-02070-2
DO  - 10.1007/s11263-024-02070-2
M3  - Journal article
VL  - 132
SP  - 4091
EP  - 4122
JO  - International Journal of Computer Vision
JF  - International Journal of Computer Vision
SN  - 0920-5691
ER  -