Research output: Contribution to Journal/Magazine › Journal article › peer-review
}
TY - JOUR
T1 - Robust online multi-target visual tracking using a HISP filter with discriminative deep appearance learning
AU - Baisa, N.L.
PY - 2021/5/31
Y1 - 2021/5/31
N2 - We propose a novel online multi-target visual tracker based on the recently developed Hypothesized and Independent Stochastic Population (HISP) filter. The HISP filter combines the advantages of traditional tracking approaches such as MHT and point-process-based approaches such as the PHD filter, and it has linear complexity while maintaining track identities. We apply this filter to track multiple targets in video sequences acquired under varying environmental conditions and target densities, using a tracking-by-detection approach. We also adopt a deep CNN appearance representation by training a verification-identification network (VerIdNet) on large-scale person re-identification data sets. We construct an augmented likelihood in a principled manner using these deep CNN appearance features and spatio-temporal information. Furthermore, we solve the problem of two or more targets having an identical label by considering the weight propagated with each confirmed hypothesis. Extensive experiments on the MOT16 and MOT17 benchmark data sets show that our tracker significantly outperforms several state-of-the-art trackers in terms of tracking accuracy.
AB - We propose a novel online multi-target visual tracker based on the recently developed Hypothesized and Independent Stochastic Population (HISP) filter. The HISP filter combines the advantages of traditional tracking approaches such as MHT and point-process-based approaches such as the PHD filter, and it has linear complexity while maintaining track identities. We apply this filter to track multiple targets in video sequences acquired under varying environmental conditions and target densities, using a tracking-by-detection approach. We also adopt a deep CNN appearance representation by training a verification-identification network (VerIdNet) on large-scale person re-identification data sets. We construct an augmented likelihood in a principled manner using these deep CNN appearance features and spatio-temporal information. Furthermore, we solve the problem of two or more targets having an identical label by considering the weight propagated with each confirmed hypothesis. Extensive experiments on the MOT16 and MOT17 benchmark data sets show that our tracker significantly outperforms several state-of-the-art trackers in terms of tracking accuracy.
KW - Appearance learning
KW - CNN
KW - HISP filter
KW - MOT challenge
KW - Multiple target filtering
KW - Online tracking
KW - Deep learning
KW - E-learning
KW - Stochastic systems
KW - Environmental conditions
KW - Linear complexity
KW - Person re-identification
KW - Spatiotemporal information
KW - Stochastic population
KW - Tracking approaches
KW - Tracking by detection
KW - Target tracking
U2 - 10.1016/j.jvcir.2020.102952
DO - 10.1016/j.jvcir.2020.102952
M3 - Journal article
VL - 77
JO - Journal of Visual Communication and Image Representation
JF - Journal of Visual Communication and Image Representation
SN - 1047-3203
M1 - 102952
ER -