
One-Shot Action Recognition via Multi-Scale Spatial-Temporal Skeleton Matching

Research output: Contribution to Journal/Magazine › Journal article › peer-review

  • Siyuan Yang
  • Jun Liu
  • Shijian Lu
  • Er Meng Hwa
  • Alex C. Kot
Journal publication date: 31/07/2024
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Issue number: 7
Volume: 46
Number of pages: 8
Pages (from-to): 5149-5156
Publication status: Published
Early online date: 8/02/2024
Original language: English

Abstract

One-shot skeleton action recognition, which aims to learn a skeleton action recognition model from a single training sample, has attracted increasing interest due to the challenge of collecting and annotating large-scale skeleton action data. However, most existing studies match skeleton sequences by comparing their feature vectors directly, which neglects the spatial structures and temporal orders of skeleton data. This paper presents a novel one-shot skeleton action recognition technique that performs recognition via multi-scale spatial-temporal feature matching. We represent skeleton data at multiple spatial and temporal scales and achieve optimal feature matching from two perspectives. The first is multi-scale matching, which captures the scale-wise semantic relevance of skeleton data at multiple spatial and temporal scales simultaneously. The second is cross-scale matching, which handles different motion magnitudes and speeds by capturing sample-wise relevance across multiple scales. Extensive experiments on three large-scale datasets (NTU RGB+D, NTU RGB+D 120, and PKU-MMD) show that our method achieves superior one-shot skeleton action recognition and consistently outperforms the state of the art by large margins.
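
The following is a minimal sketch of the matching idea the abstract describes, not the paper's actual method: it assumes per-scale feature vectors have already been extracted for each skeleton sequence, and the function names (`cosine`, `match_score`), the cosine similarity metric, and the averaging fusion are illustrative assumptions rather than details taken from the paper.

```python
# Hedged sketch: multi-scale and cross-scale matching for one-shot
# skeleton action recognition. The encoder, similarity metric, and
# score fusion here are assumptions, not the paper's specification.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def match_score(query_feats: list[np.ndarray],
                support_feats: list[np.ndarray]) -> float:
    """Combine scale-wise and cross-scale similarities.

    query_feats / support_feats hold one feature vector per
    spatial-temporal scale, in the same order for both sequences.
    """
    n = len(query_feats)
    # Multi-scale matching: compare the two sequences at the SAME
    # scale, capturing scale-wise semantic relevance.
    same_scale = [cosine(q, s) for q, s in zip(query_feats, support_feats)]
    # Cross-scale matching: compare across DIFFERENT scales, so that
    # actions with different motion magnitudes/speeds can still match.
    cross_scale = [cosine(query_feats[i], support_feats[j])
                   for i in range(n) for j in range(n) if i != j]
    # Illustrative fusion: average the two groups of similarities.
    return 0.5 * (float(np.mean(same_scale)) + float(np.mean(cross_scale)))

# One-shot classification: assign the query to the exemplar class
# with the highest matching score (random features for illustration).
rng = np.random.default_rng(0)
scales, dim = 3, 64
query = [rng.standard_normal(dim) for _ in range(scales)]
exemplars = {c: [rng.standard_normal(dim) for _ in range(scales)]
             for c in ["drink", "wave", "sit"]}
pred = max(exemplars, key=lambda c: match_score(query, exemplars[c]))
print("predicted class:", pred)
```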