Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Inference-Domain Network Evolution
T2 - A New Perspective for One-Shot Multi-Object Tracking
AU - Li, Rui
AU - Zhang, Baopeng
AU - Liu, Jun
AU - Liu, Wei
AU - Teng, Zhu
PY - 2023/12/31
Y1 - 2023/12/31
N2 - Supervised one-shot multi-object tracking (MOT) algorithms have achieved satisfactory performance by benefiting from large amounts of labeled data. However, in real applications, acquiring large numbers of laborious manual annotations is not practical. It is therefore necessary to adapt a one-shot MOT model trained on a labeled domain to an unlabeled domain, yet such domain adaptation is a challenging problem. The main reason is that the model has to detect and associate multiple moving objects distributed across various spatial locations, while there are obvious discrepancies in style, object identity, quantity, and scale among different domains. Motivated by this, we propose a novel inference-domain network evolution to enhance the generalization ability of the one-shot MOT model. Specifically, we design a spatial topology-based one-shot network (STONet) to perform the one-shot MOT task, where a self-supervision mechanism is employed to stimulate the feature extractor to learn spatial contexts without any annotated information. Furthermore, a temporal identity aggregation (TIA) module is proposed to help STONet weaken the adverse effects of noisy labels during network evolution. TIA aggregates historical embeddings with the same identity to learn cleaner and more reliable pseudo labels. In the inference domain, the proposed STONet with TIA progressively performs pseudo-label collection and parameter updates to realize the network evolution from the labeled source domain to an unlabeled inference domain. Extensive experiments and ablation studies conducted on MOT15, MOT17, and MOT20 demonstrate the effectiveness of the proposed model.
AB - Supervised one-shot multi-object tracking (MOT) algorithms have achieved satisfactory performance by benefiting from large amounts of labeled data. However, in real applications, acquiring large numbers of laborious manual annotations is not practical. It is therefore necessary to adapt a one-shot MOT model trained on a labeled domain to an unlabeled domain, yet such domain adaptation is a challenging problem. The main reason is that the model has to detect and associate multiple moving objects distributed across various spatial locations, while there are obvious discrepancies in style, object identity, quantity, and scale among different domains. Motivated by this, we propose a novel inference-domain network evolution to enhance the generalization ability of the one-shot MOT model. Specifically, we design a spatial topology-based one-shot network (STONet) to perform the one-shot MOT task, where a self-supervision mechanism is employed to stimulate the feature extractor to learn spatial contexts without any annotated information. Furthermore, a temporal identity aggregation (TIA) module is proposed to help STONet weaken the adverse effects of noisy labels during network evolution. TIA aggregates historical embeddings with the same identity to learn cleaner and more reliable pseudo labels. In the inference domain, the proposed STONet with TIA progressively performs pseudo-label collection and parameter updates to realize the network evolution from the labeled source domain to an unlabeled inference domain. Extensive experiments and ablation studies conducted on MOT15, MOT17, and MOT20 demonstrate the effectiveness of the proposed model.
U2 - 10.1109/TIP.2023.3263104
DO - 10.1109/TIP.2023.3263104
M3 - Journal article
VL - 32
SP - 2147
EP - 2159
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
SN - 1057-7149
ER -