Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition

Computing and Communications

Associated organisational units

Electronic data

ECCV2022_Lingeng_LiTianjiao_rgb_finegrained
Accepted author manuscript, 795 KB, PDF document

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

E-pub ahead of print

Standard

Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition. / Rahmani, Hossein.
Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition. Springer, 2022. (European Conference on Computer Vision (ECCV)).

Harvard

Rahmani, H 2022, Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition. in Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition. European Conference on Computer Vision (ECCV), Springer.

APA

Rahmani, H. (2022). Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition. In Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition (European Conference on Computer Vision (ECCV)). Springer. Advance online publication.

Vancouver

Rahmani H. Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition. In Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition. Springer. 2022. (European Conference on Computer Vision (ECCV)). Epub 2022 Oct 27.

Author

Rahmani, Hossein. / Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition. Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition. Springer, 2022. (European Conference on Computer Vision (ECCV)).

Bibtex

@inproceedings{4315afee0e8f43d18ca0a4f0f54cb0da,

title = "Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition",

abstract = "The goal of fine-grained action recognition is to successfully discriminate between action categories with subtle differences. To tackle this, we derive inspiration from the human visual system which contains specialized regions in the brain that are dedicated towards handling specific tasks. We design a novel Dynamic Spatio-Temporal Specialization (DSTS) module, which consists of specialized neurons that are only activated for a subset of samples that are highly similar. During training, the loss forces the specialized neurons to learn discriminative fine-grained differences to distinguish between these similar samples, improving fine-grained recognition. Moreover, a spatio-temporal specialization method further optimizes the architectures of the specialized neurons to capture either more spatial or temporal fine-grained information, to better tackle the large range of spatio-temporal variations in the videos. Lastly, we design an Upstream-Downstream Learning algorithm to optimize our model{\textquoteright}s dynamic decisions during training, improving the performance of our DSTS module. We obtain state-of-the-art performance on two widely-used fine-grained action recognition datasets.",

author = "Hossein Rahmani",

year = "2022",

month = oct,

day = "27",

language = "English",

series = "European Conference on Computer Vision (ECCV)",

publisher = "Springer",

booktitle = "Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition",

}

RIS

TY - GEN

T1 - Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition

AU - Rahmani, Hossein

PY - 2022/10/27

Y1 - 2022/10/27

N2 - The goal of fine-grained action recognition is to successfully discriminate between action categories with subtle differences. To tackle this, we derive inspiration from the human visual system which contains specialized regions in the brain that are dedicated towards handling specific tasks. We design a novel Dynamic Spatio-Temporal Specialization (DSTS) module, which consists of specialized neurons that are only activated for a subset of samples that are highly similar. During training, the loss forces the specialized neurons to learn discriminative fine-grained differences to distinguish between these similar samples, improving fine-grained recognition. Moreover, a spatio-temporal specialization method further optimizes the architectures of the specialized neurons to capture either more spatial or temporal fine-grained information, to better tackle the large range of spatio-temporal variations in the videos. Lastly, we design an Upstream-Downstream Learning algorithm to optimize our model’s dynamic decisions during training, improving the performance of our DSTS module. We obtain state-of-the-art performance on two widely-used fine-grained action recognition datasets.

AB - The goal of fine-grained action recognition is to successfully discriminate between action categories with subtle differences. To tackle this, we derive inspiration from the human visual system which contains specialized regions in the brain that are dedicated towards handling specific tasks. We design a novel Dynamic Spatio-Temporal Specialization (DSTS) module, which consists of specialized neurons that are only activated for a subset of samples that are highly similar. During training, the loss forces the specialized neurons to learn discriminative fine-grained differences to distinguish between these similar samples, improving fine-grained recognition. Moreover, a spatio-temporal specialization method further optimizes the architectures of the specialized neurons to capture either more spatial or temporal fine-grained information, to better tackle the large range of spatio-temporal variations in the videos. Lastly, we design an Upstream-Downstream Learning algorithm to optimize our model’s dynamic decisions during training, improving the performance of our DSTS module. We obtain state-of-the-art performance on two widely-used fine-grained action recognition datasets.

M3 - Conference contribution/Paper

T3 - European Conference on Computer Vision (ECCV)

BT - Dynamic Spatio-Temporal Specialization Learning for Fine-Grained Action Recognition

PB - Springer

ER -
