Final published version
Licence: CC BY (Creative Commons Attribution 4.0 International License)
Research output: Contribution to Journal/Magazine › Journal article › peer-review
TY - JOUR
T1 - DIMBA: discretely masked black-box attack in single object tracking
AU - Yin, Xiangyu
AU - Ruan, Wenjie
AU - Fieldsend, Jonathan
PY - 2024/4/1
Y1 - 2024/4/1
N2 - An adversarial attack can force a CNN-based model to produce an incorrect output by craftily manipulating its input with human-imperceptible perturbations. Exploring such perturbations helps us gain a deeper understanding of the vulnerability of neural networks and makes deep learning more robust against miscellaneous adversaries. Despite extensive studies on the robustness of image, audio, and NLP models, work on adversarial examples for visual object tracking, especially in a black-box manner, is still lacking. In this paper, we propose a novel adversarial attack method that generates noise for single object tracking under black-box settings, where perturbations are only added to the initial frames of tracking sequences, making them difficult to notice from the perspective of a whole video clip. Specifically, we divide our algorithm into three components and exploit reinforcement learning to localize important frame patches precisely while reducing unnecessary query overhead. Compared to existing techniques, our method requires less time to perturb videos while achieving competitive or even better adversarial performance. We test our algorithm on both long-term and short-term datasets, including OTB100, VOT2018, UAV123, and LaSOT. Extensive experiments demonstrate the effectiveness of our method on three mainstream types of trackers: discrimination-based, Siamese-based, and reinforcement learning-based trackers. We release our attack tool, DIMBA, via GitHub https://github.com/TrustAI/DIMBA for use by the community.
AB - An adversarial attack can force a CNN-based model to produce an incorrect output by craftily manipulating its input with human-imperceptible perturbations. Exploring such perturbations helps us gain a deeper understanding of the vulnerability of neural networks and makes deep learning more robust against miscellaneous adversaries. Despite extensive studies on the robustness of image, audio, and NLP models, work on adversarial examples for visual object tracking, especially in a black-box manner, is still lacking. In this paper, we propose a novel adversarial attack method that generates noise for single object tracking under black-box settings, where perturbations are only added to the initial frames of tracking sequences, making them difficult to notice from the perspective of a whole video clip. Specifically, we divide our algorithm into three components and exploit reinforcement learning to localize important frame patches precisely while reducing unnecessary query overhead. Compared to existing techniques, our method requires less time to perturb videos while achieving competitive or even better adversarial performance. We test our algorithm on both long-term and short-term datasets, including OTB100, VOT2018, UAV123, and LaSOT. Extensive experiments demonstrate the effectiveness of our method on three mainstream types of trackers: discrimination-based, Siamese-based, and reinforcement learning-based trackers. We release our attack tool, DIMBA, via GitHub https://github.com/TrustAI/DIMBA for use by the community.
KW - Adversarial example
KW - Black-box attack
KW - Visual object tracking
U2 - 10.1007/s10994-022-06252-2
DO - 10.1007/s10994-022-06252-2
M3 - Journal article
VL - 113
SP - 1705
EP - 1723
JO - Machine Learning
JF - Machine Learning
SN - 0885-6125
IS - 4
ER -