Electronic data

  • paper

    Accepted author manuscript, 2.08 MB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Momentum Contrastive Teacher for Semi-Supervised Skeleton Action Recognition

Research output: Contribution to Journal/Magazine, Journal article, peer-review

E-pub ahead of print
Journal publication date: 31/12/2025
Journal: IEEE Transactions on Image Processing
Volume: 34
Number of pages: 11
Pages (from-to): 295-305
Publication Status: E-pub ahead of print
Early online date: 1/01/25
Original language: English

Abstract

In the field of semi-supervised skeleton action recognition, existing work primarily follows the paradigm of self-supervised training followed by supervised fine-tuning. However, self-supervised learning focuses on exploring data representation rather than label classification. Inspired by Mean Teacher, we explore a novel pseudo-label-based model called SkeleMoCLR. Specifically, we use MoCo v2 as the foundation and extend it into a teacher-student network through a momentum encoder. The generation of high-confidence pseudo-labels requires a well-pretrained model as a prerequisite. In cases where large-scale skeleton data is lacking, we propose leveraging contrastive learning to transfer discriminative action features from large vision-text models to the skeleton encoder. Following the contrastive pre-training, the key encoder branch from MoCo v2 serves as the teacher to generate pseudo-labels for training the query encoder branch. Furthermore, we introduce pseudo-labels into the memory queues, sampling negative samples from different pseudo-label classes to maximize the representation differentiation between different categories. We jointly optimize the classification loss for both labeled and pseudo-labeled data and the contrastive loss for unlabeled data to update model parameters, fully harnessing the potential of pseudo-label semi-supervised learning and self-supervised learning. Extensive experiments conducted on the NTU-60, NTU-120, PKU-MMD, and NW-UCLA datasets demonstrate that our SkeleMoCLR outperforms existing competitive methods in the semi-supervised skeleton action recognition task.
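
The abstract names three mechanisms without giving their details: a momentum-updated key encoder acting as teacher, high-confidence pseudo-labels, and negatives drawn from the memory queue only when their pseudo-label class differs from the query's. The NumPy sketch below illustrates one plausible reading of these ideas and is not the authors' implementation. All function names, the momentum value `m`, the confidence `threshold`, and the `temperature` are assumptions chosen for illustration.

```python
import numpy as np


def momentum_update(key_params, query_params, m=0.999):
    """EMA update of teacher (key) weights from student (query) weights.

    The value of m is an assumed default, not taken from the paper.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def softmax(logits):
    """Row-wise softmax with max subtraction for numerical stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def pseudo_labels(teacher_logits, threshold=0.9):
    """Return argmax labels and a mask of samples above the confidence threshold.

    The threshold is an assumption; the abstract only says "high-confidence".
    """
    probs = softmax(teacher_logits)
    labels = probs.argmax(axis=1)
    confident = probs.max(axis=1) >= threshold
    return labels, confident


def label_aware_info_nce(q, k, queue, q_labels, queue_labels, temperature=0.07):
    """InfoNCE loss in which queue entries that share the query's pseudo-label
    are excluded from the negatives (hypothetical reading of the abstract)."""
    q = q / np.linalg.norm(q, axis=1, keepdims=True)
    k = k / np.linalg.norm(k, axis=1, keepdims=True)
    queue = queue / np.linalg.norm(queue, axis=1, keepdims=True)

    pos = np.sum(q * k, axis=1, keepdims=True) / temperature  # shape (B, 1)
    neg = (q @ queue.T) / temperature                         # shape (B, K)

    # Mask out same-class queue entries so they contribute nothing.
    same_class = q_labels[:, None] == queue_labels[None, :]
    neg = np.where(same_class, -np.inf, neg)

    logits = np.concatenate([pos, neg], axis=1)
    mx = logits.max(axis=1, keepdims=True)
    log_sum_exp = mx + np.log(np.exp(logits - mx).sum(axis=1, keepdims=True))
    return float(np.mean(log_sum_exp - pos))
```

In a full training loop, a loss of this kind would be combined with classification losses on labeled and confidently pseudo-labeled samples, as the abstract describes; how the terms are weighted is not stated there and is left out of the sketch.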