Generalizing universal adversarial perturbations for deep neural networks

Research output: Contribution to Journal/Magazine › Journal article › peer-review

E-pub ahead of print

Standard

Generalizing universal adversarial perturbations for deep neural networks. / Zhang, Yanghao; Ruan, Wenjie; Wang, Fu et al.
In: Machine Learning, 22.03.2023.

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Vancouver

Zhang Y, Ruan W, Wang F, Huang X. Generalizing universal adversarial perturbations for deep neural networks. Machine Learning. 2023 Mar 22. Epub 2023 Mar 22. doi: 10.1007/s10994-023-06306-z

BibTeX

@article{ab3d0956e4874665aa1fe8908f7f0b10,
title = "Generalizing universal adversarial perturbations for deep neural networks",
abstract = "Previous studies have shown that universal adversarial attacks can fool deep neural networks over a large set of input images with a single human-invisible perturbation. However, current methods for universal adversarial attacks are based on additive perturbation, which enables misclassification by directly adding the perturbation on the input images. In this paper, for the first time, we show that a universal adversarial attack can also be achieved through spatial transformation (non-additive). More importantly, to unify both additive and non-additive perturbations, we propose a novel unified yet flexible framework for universal adversarial attacks, called GUAP, which can initiate attacks by ℓ∞-norm (additive) perturbation, spatially-transformed (non-additive) perturbation, or a combination of both. Extensive experiments are conducted on two computer vision scenarios, including image classification and semantic segmentation tasks, which contain CIFAR-10, ImageNet and Cityscapes datasets with a number of different deep neural network models, including GoogLeNet, VGG16/19, ResNet101/152, DenseNet121, and FCN-8s. Empirical experiments demonstrate that GUAP can obtain higher attack success rates on these datasets compared to state-of-the-art universal adversarial attacks. In addition, we also demonstrate how universal adversarial training benefits the robustness of the model against universal attacks. We release our tool GUAP on https://github.com/TrustAI/GUAP.",
keywords = "Deep learning, Adversarial examples, Security, Deep neural networks",
author = "Yanghao Zhang and Wenjie Ruan and Fu Wang and Xiaowei Huang",
year = "2023",
month = mar,
day = "22",
doi = "10.1007/s10994-023-06306-z",
language = "English",
journal = "Machine Learning",
issn = "0885-6125",
publisher = "Springer Netherlands",

}

RIS

TY - JOUR

T1 - Generalizing universal adversarial perturbations for deep neural networks

AU - Zhang, Yanghao

AU - Ruan, Wenjie

AU - Wang, Fu

AU - Huang, Xiaowei

PY - 2023/3/22

Y1 - 2023/3/22

N2 - Previous studies have shown that universal adversarial attacks can fool deep neural networks over a large set of input images with a single human-invisible perturbation. However, current methods for universal adversarial attacks are based on additive perturbation, which enables misclassification by directly adding the perturbation on the input images. In this paper, for the first time, we show that a universal adversarial attack can also be achieved through spatial transformation (non-additive). More importantly, to unify both additive and non-additive perturbations, we propose a novel unified yet flexible framework for universal adversarial attacks, called GUAP, which can initiate attacks by ℓ∞-norm (additive) perturbation, spatially-transformed (non-additive) perturbation, or a combination of both. Extensive experiments are conducted on two computer vision scenarios, including image classification and semantic segmentation tasks, which contain CIFAR-10, ImageNet and Cityscapes datasets with a number of different deep neural network models, including GoogLeNet, VGG16/19, ResNet101/152, DenseNet121, and FCN-8s. Empirical experiments demonstrate that GUAP can obtain higher attack success rates on these datasets compared to state-of-the-art universal adversarial attacks. In addition, we also demonstrate how universal adversarial training benefits the robustness of the model against universal attacks. We release our tool GUAP on https://github.com/TrustAI/GUAP.

AB - Previous studies have shown that universal adversarial attacks can fool deep neural networks over a large set of input images with a single human-invisible perturbation. However, current methods for universal adversarial attacks are based on additive perturbation, which enables misclassification by directly adding the perturbation on the input images. In this paper, for the first time, we show that a universal adversarial attack can also be achieved through spatial transformation (non-additive). More importantly, to unify both additive and non-additive perturbations, we propose a novel unified yet flexible framework for universal adversarial attacks, called GUAP, which can initiate attacks by ℓ∞-norm (additive) perturbation, spatially-transformed (non-additive) perturbation, or a combination of both. Extensive experiments are conducted on two computer vision scenarios, including image classification and semantic segmentation tasks, which contain CIFAR-10, ImageNet and Cityscapes datasets with a number of different deep neural network models, including GoogLeNet, VGG16/19, ResNet101/152, DenseNet121, and FCN-8s. Empirical experiments demonstrate that GUAP can obtain higher attack success rates on these datasets compared to state-of-the-art universal adversarial attacks. In addition, we also demonstrate how universal adversarial training benefits the robustness of the model against universal attacks. We release our tool GUAP on https://github.com/TrustAI/GUAP.

KW - Deep learning

KW - Adversarial examples

KW - Security

KW - Deep neural networks

U2 - 10.1007/s10994-023-06306-z

DO - 10.1007/s10994-023-06306-z

M3 - Journal article

JO - Machine Learning

JF - Machine Learning

SN - 0885-6125

ER -
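
The abstract above describes GUAP as unifying two kinds of universal perturbation: an additive ℓ∞-bounded perturbation and a non-additive, spatially-transformed perturbation, which can also be combined. The sketch below illustrates, under assumed PyTorch conventions, how a single universal flow field and a single universal additive perturbation might be applied to a whole batch of images. All names (apply_universal_attack, delta, flow, epsilon, model) are placeholders introduced here for illustration; they do not reflect the authors' released implementation at https://github.com/TrustAI/GUAP.

import torch
import torch.nn.functional as F

# Minimal sketch, assuming a PyTorch setup. One universal flow field
# (spatial, non-additive) and one universal l_inf-bounded perturbation
# (additive) are applied to every image in a batch, mirroring the two
# perturbation types described in the abstract. Tensor names are
# illustrative placeholders, not the released GUAP code.

def apply_universal_attack(images, delta, flow, epsilon=8 / 255):
    """images: (N, C, H, W) in [0, 1]; delta: (1, C, H, W); flow: (1, H, W, 2)."""
    n, _, h, w = images.shape

    # Identity sampling grid with coordinates in [-1, 1] (grid_sample convention).
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    base_grid = torch.stack((xs, ys), dim=-1).unsqueeze(0)  # (1, H, W, 2)

    # Non-additive part: warp every image with the same universal flow field.
    warped = F.grid_sample(
        images,
        (base_grid + flow).expand(n, -1, -1, -1),
        mode="bilinear",
        padding_mode="border",
        align_corners=True,
    )

    # Additive part: add the same l_inf-clipped perturbation to every image.
    return (warped + delta.clamp(-epsilon, epsilon)).clamp(0.0, 1.0)

# Hypothetical usage: fraction of predictions flipped by the universal attack.
# model, images, labels stand in for a trained classifier and a labelled batch.
# adv = apply_universal_attack(images, delta, flow)
# fooled_rate = (model(adv).argmax(dim=1) != labels).float().mean()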