
Electronic data

  • 2209.06300v1

    Submitted manuscript, 6.55 MB, PDF document

    Available under license: CC BY-NC-SA: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License


Pinch: An Adversarial Extraction Attack Framework for Deep Learning Models

Research output: Contribution to Journal/Magazine > Journal article

Journal publication date: 13/09/2022
Journal: arXiv
Publication status: Published
Original language: English

Abstract

Deep Learning (DL) models increasingly power a diversity of applications. Unfortunately, this pervasiveness also makes them attractive targets for extraction attacks, which can steal the architecture, parameters, and hyper-parameters of a targeted DL model. Existing extraction attack studies have observed varying levels of attack success for different DL models and datasets, yet the underlying causes of their susceptibility often remain unclear. Ascertaining such root-cause weaknesses would help facilitate secure DL systems, though this requires studying extraction attacks in a wide variety of scenarios to identify commonalities across attack success and DL characteristics. The overwhelmingly high technical effort and time required to understand, implement, and evaluate even a single attack make it infeasible to explore the large number of unique extraction attack scenarios in existence, and current frameworks are typically designed to operate only for specific attack types, datasets, and hardware platforms. In this paper, we present PINCH: an efficient and automated extraction attack framework capable of deploying and evaluating multiple DL models and attacks across heterogeneous hardware platforms. We demonstrate the effectiveness of PINCH by empirically evaluating a large number of previously unexplored extraction attack scenarios, as well as secondary attack staging. Our key findings show that 1) multiple characteristics affect extraction attack success, spanning DL model architecture, dataset complexity, hardware, and attack type; and 2) partially successful extraction attacks significantly enhance the success of further adversarial attack staging.

Bibliographic note

15 pages, 11 figures, 2 tables