Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Meta Spatio-Temporal Debiasing for Video Scene Graph Generation. / Xu, Li; Qu, Haoxuan; Kuen, Jason et al.
Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings. ed. / Shai Avidan; Gabriel Brostow; Moustapha Cissé; Giovanni Maria Farinella; Tal Hassner. Cham: Springer, 2022. p. 374-390 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13687 LNCS).
Harvard

Xu, L, Qu, H, Kuen, J, Gu, J & Liu, J 2022, Meta Spatio-Temporal Debiasing for Video Scene Graph Generation. in S Avidan, G Brostow, M Cissé, GM Farinella & T Hassner (eds), Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 13687 LNCS, Springer, Cham, pp. 374-390, 17th European Conference on Computer Vision, ECCV 2022, Tel Aviv, Israel, 23/10/22. https://doi.org/10.1007/978-3-031-19812-0_22

APA

Xu, L., Qu, H., Kuen, J., Gu, J., & Liu, J. (2022). Meta Spatio-Temporal Debiasing for Video Scene Graph Generation. In S. Avidan, G. Brostow, M. Cissé, G. M. Farinella, & T. Hassner (Eds.), Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings (pp. 374-390). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 13687 LNCS). Springer. https://doi.org/10.1007/978-3-031-19812-0_22

Vancouver

Xu L, Qu H, Kuen J, Gu J, Liu J. Meta Spatio-Temporal Debiasing for Video Scene Graph Generation. In Avidan S, Brostow G, Cissé M, Farinella GM, Hassner T, editors, Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings. Cham: Springer. 2022. p. 374-390. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-031-19812-0_22

Author

Xu, Li; Qu, Haoxuan; Kuen, Jason et al. / Meta Spatio-Temporal Debiasing for Video Scene Graph Generation. Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings. editor / Shai Avidan; Gabriel Brostow; Moustapha Cissé; Giovanni Maria Farinella; Tal Hassner. Cham: Springer, 2022. pp. 374-390 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

Bibtex

@inproceedings{51cb64e89ce34d5d94904719d830fe9c,
title = "Meta Spatio-Temporal Debiasing for Video Scene Graph Generation",
abstract = "Video scene graph generation (VidSGG) aims to parse the video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video. However, due to the long-tailed training data in datasets, the generalization performance of existing VidSGG models can be affected by the spatio-temporal conditional bias problem. In this work, from the perspective of meta-learning, we propose a novel Meta Video Scene Graph Generation (MVSGG) framework to address such a bias problem. Specifically, to handle various types of spatio-temporal conditional biases, our framework first constructs a support set and a group of query sets from the training data, where the data distribution of each query set is different from that of the support set w.r.t. a type of conditional bias. Then, by performing a novel meta training and testing process to optimize the model to obtain good testing performance on these query sets after training on the support set, our framework can effectively guide the model to learn to well generalize against biases. Extensive experiments demonstrate the efficacy of our proposed framework.",
keywords = "Long-tailed bias, Meta learning, VidSGG",
author = "Li Xu and Haoxuan Qu and Jason Kuen and Jiuxiang Gu and Jun Liu",
year = "2022",
month = oct,
day = "30",
doi = "10.1007/978-3-031-19812-0_22",
language = "English",
isbn = "9783031198113",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer",
pages = "374--390",
editor = "Shai Avidan and Gabriel Brostow and Moustapha Ciss{\'e} and Farinella, {Giovanni Maria} and Tal Hassner",
booktitle = "Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings",
note = "17th European Conference on Computer Vision, ECCV 2022 ; Conference date: 23-10-2022 Through 27-10-2022",

}

RIS

TY - GEN

T1 - Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

AU - Xu, Li

AU - Qu, Haoxuan

AU - Kuen, Jason

AU - Gu, Jiuxiang

AU - Liu, Jun

PY - 2022/10/30

Y1 - 2022/10/30

N2 - Video scene graph generation (VidSGG) aims to parse the video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video. However, due to the long-tailed training data in datasets, the generalization performance of existing VidSGG models can be affected by the spatio-temporal conditional bias problem. In this work, from the perspective of meta-learning, we propose a novel Meta Video Scene Graph Generation (MVSGG) framework to address such a bias problem. Specifically, to handle various types of spatio-temporal conditional biases, our framework first constructs a support set and a group of query sets from the training data, where the data distribution of each query set is different from that of the support set w.r.t. a type of conditional bias. Then, by performing a novel meta training and testing process to optimize the model to obtain good testing performance on these query sets after training on the support set, our framework can effectively guide the model to learn to well generalize against biases. Extensive experiments demonstrate the efficacy of our proposed framework.

AB - Video scene graph generation (VidSGG) aims to parse the video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video. However, due to the long-tailed training data in datasets, the generalization performance of existing VidSGG models can be affected by the spatio-temporal conditional bias problem. In this work, from the perspective of meta-learning, we propose a novel Meta Video Scene Graph Generation (MVSGG) framework to address such a bias problem. Specifically, to handle various types of spatio-temporal conditional biases, our framework first constructs a support set and a group of query sets from the training data, where the data distribution of each query set is different from that of the support set w.r.t. a type of conditional bias. Then, by performing a novel meta training and testing process to optimize the model to obtain good testing performance on these query sets after training on the support set, our framework can effectively guide the model to learn to well generalize against biases. Extensive experiments demonstrate the efficacy of our proposed framework.

KW - Long-tailed bias

KW - Meta learning

KW - VidSGG

U2 - 10.1007/978-3-031-19812-0_22

DO - 10.1007/978-3-031-19812-0_22

M3 - Conference contribution/Paper

AN - SCOPUS:85142697978

SN - 9783031198113

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 374

EP - 390

BT - Computer Vision – ECCV 2022 - 17th European Conference, 2022, Proceedings

A2 - Avidan, Shai

A2 - Brostow, Gabriel

A2 - Cissé, Moustapha

A2 - Farinella, Giovanni Maria

A2 - Hassner, Tal

PB - Springer

CY - Cham

T2 - 17th European Conference on Computer Vision, ECCV 2022

Y2 - 23 October 2022 through 27 October 2022

ER -
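
The meta training-and-testing procedure the abstract describes — adapt on a support set, then optimize so the adapted model also performs well on query sets whose distribution differs with respect to a bias type — resembles first-order MAML-style bi-level optimization. Below is a minimal, hypothetical sketch of that general idea on a toy linear-regression task; the data, loss, and all function names are illustrative assumptions, not the paper's actual VidSGG model or training code.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss_and_grad(w, X, y):
    """Mean squared error of a linear model, with its gradient w.r.t. w."""
    err = X @ w - y
    return float(np.mean(err ** 2)), 2 * X.T @ err / len(y)

def meta_step(w, support, queries, inner_lr=0.1, meta_lr=0.05):
    """One first-order meta-update: adapt on the support set (meta-training),
    then update w with the gradient of the adapted model's loss averaged
    over the distribution-shifted query sets (meta-testing)."""
    X_s, y_s = support
    _, g_s = loss_and_grad(w, X_s, y_s)
    w_adapted = w - inner_lr * g_s              # inner adaptation step
    meta_grad = np.zeros_like(w)
    for X_q, y_q in queries:                    # evaluate on each query set
        _, g_q = loss_and_grad(w_adapted, X_q, y_q)
        meta_grad += g_q / len(queries)
    return w - meta_lr * meta_grad              # first-order outer update

# Toy data: one support set and two query sets whose input distributions
# are shifted relative to the support set (a stand-in for conditional bias).
w_true = np.array([2.0, -1.0])
def make_set(shift, n=64):
    X = rng.normal(shift, 1.0, size=(n, 2))
    return X, X @ w_true + 0.01 * rng.normal(size=n)

support = make_set(0.0)
queries = [make_set(1.5), make_set(-1.5)]

w = np.zeros(2)
for _ in range(200):
    w = meta_step(w, support, queries)

# Despite the query-set shift, the meta-updated weights approach w_true.
print(w)
```

The point of the sketch is the bi-level structure: the outer update is driven by performance on query sets that are deliberately distributed differently from the support set, which is what pushes the model to generalize against the chosen bias rather than fit the support distribution.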