SUTD-TrafficQA: A question answering benchmark and an efficient network for video reasoning over traffic events

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN > Conference contribution/Paper > peer-review

Published

Standard

SUTD-TrafficQA: A question answering benchmark and an efficient network for video reasoning over traffic events. / Xu, L.; Huang, H.; Liu, Jun.
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2021.

APA

Xu, L., Huang, H., & Liu, J. (2021). SUTD-TrafficQA: A question answering benchmark and an efficient network for video reasoning over traffic events. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. https://doi.org/10.1109/CVPR46437.2021.00975

Vancouver

Xu L, Huang H, Liu J. SUTD-TrafficQA: A question answering benchmark and an efficient network for video reasoning over traffic events. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE. 2021 Epub 2021 Jun 20. doi: 10.1109/CVPR46437.2021.00975

Author

Xu, L.; Huang, H.; Liu, Jun. / SUTD-TrafficQA: A question answering benchmark and an efficient network for video reasoning over traffic events. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2021.

Bibtex

@inproceedings{8300844ecd4240bcad82d6ce2fede721,
title = "SUTD-TrafficQA: A question answering benchmark and an efficient network for video reasoning over traffic events",
abstract = "Traffic event cognition and reasoning in videos is an important task that has a wide range of applications in intelligent transportation, assisted driving, and autonomous vehicles. In this paper, we create a novel dataset, SUTD-TrafficQA (Traffic Question Answering), which takes the form of video QA based on the collected 10,080 in-the-wild videos and annotated 62,535 QA pairs, for benchmarking the cognitive capability of causal inference and event understanding models in complex traffic scenarios. Specifically, we propose 6 challenging reasoning tasks corresponding to various traffic scenarios, so as to evaluate the reasoning capability over different kinds of complex yet practical traffic events. Moreover, we propose Eclipse, a novel Efficient glimpse network via dynamic inference, in order to achieve computation-efficient and reliable video reasoning. The experiments show that our method achieves superior performance while reducing the computation cost significantly. The project page: https://github.com/SUTDCV/SUTD-TrafficQA.",
author = "L. Xu and H. Huang and Jun Liu",
year = "2021",
month = nov,
day = "2",
doi = "10.1109/CVPR46437.2021.00975",
language = "English",
isbn = "9781665445108",
booktitle = "2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)",
publisher = "IEEE",
}

RIS

TY - GEN

T1 - SUTD-TrafficQA

T2 - A question answering benchmark and an efficient network for video reasoning over traffic events

AU - Xu, L.

AU - Huang, H.

AU - Liu, Jun

PY - 2021/11/2

Y1 - 2021/11/2

N2 - Traffic event cognition and reasoning in videos is an important task that has a wide range of applications in intelligent transportation, assisted driving, and autonomous vehicles. In this paper, we create a novel dataset, SUTD-TrafficQA (Traffic Question Answering), which takes the form of video QA based on the collected 10,080 in-the-wild videos and annotated 62,535 QA pairs, for benchmarking the cognitive capability of causal inference and event understanding models in complex traffic scenarios. Specifically, we propose 6 challenging reasoning tasks corresponding to various traffic scenarios, so as to evaluate the reasoning capability over different kinds of complex yet practical traffic events. Moreover, we propose Eclipse, a novel Efficient glimpse network via dynamic inference, in order to achieve computation-efficient and reliable video reasoning. The experiments show that our method achieves superior performance while reducing the computation cost significantly. The project page: https://github.com/SUTDCV/SUTD-TrafficQA.

AB - Traffic event cognition and reasoning in videos is an important task that has a wide range of applications in intelligent transportation, assisted driving, and autonomous vehicles. In this paper, we create a novel dataset, SUTD-TrafficQA (Traffic Question Answering), which takes the form of video QA based on the collected 10,080 in-the-wild videos and annotated 62,535 QA pairs, for benchmarking the cognitive capability of causal inference and event understanding models in complex traffic scenarios. Specifically, we propose 6 challenging reasoning tasks corresponding to various traffic scenarios, so as to evaluate the reasoning capability over different kinds of complex yet practical traffic events. Moreover, we propose Eclipse, a novel Efficient glimpse network via dynamic inference, in order to achieve computation-efficient and reliable video reasoning. The experiments show that our method achieves superior performance while reducing the computation cost significantly. The project page: https://github.com/SUTDCV/SUTD-TrafficQA.

U2 - 10.1109/CVPR46437.2021.00975

DO - 10.1109/CVPR46437.2021.00975

M3 - Conference contribution/Paper

SN - 9781665445108

BT - 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

PB - IEEE

ER -