
Electronic data

  • PUDD

    Accepted author manuscript, 3.53 MB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License


PUDD: Towards Robust Multi-modal Prototype-based Deepfake Detection

Research output: Contribution to conference (without ISBN/ISSN) › Conference paper › peer-review

Forthcoming
Publication date: 6/03/2024
Number of pages: 9
Original language: English
Event: IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) 2024: 2nd Workshop and Challenge on DeepFake Analysis and Detection - Seattle, United States
Duration: 19/06/2024 – 21/06/2024
https://cvpr.thecvf.com

Conference

Conference: IEEE / CVF Computer Vision and Pattern Recognition Conference (CVPR) 2024
Abbreviated title: CVPR 2024
Country/Territory: United States
City: Seattle
Period: 19/06/24 – 21/06/24
Internet address: https://cvpr.thecvf.com

Abstract

Deepfake techniques generate highly realistic data, making it challenging for humans to discern between actual and artificially generated images. Recent advancements in deep learning-based deepfake detection methods, particularly with diffusion models, have shown remarkable progress. However, there is a growing demand for real-world applications to detect unseen individuals, deepfake techniques, and scenarios. To address this limitation, we propose a Prototype-based Unified Framework for Deepfake Detection (PUDD). PUDD offers a detection system based on similarity, comparing input data against known prototypes for video classification and identifying potential deepfakes or previously unseen classes by analyzing drops in similarity. Our extensive experiments reveal three key findings: (1) PUDD achieves an accuracy of 95.1% on Celeb-DF, outperforming state-of-the-art deepfake detection methods; (2) PUDD leverages image classification as the upstream task during training, demonstrating promising performance in both image classification and deepfake detection tasks during inference; (3) PUDD requires only 2.7 seconds for retraining on new data and emits 10⁵ times less carbon compared to the state-of-the-art model, making it significantly more environmentally friendly.
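
As a rough illustration of the similarity-based detection the abstract describes, the sketch below builds one prototype per known class as a mean embedding and flags inputs whose best prototype similarity drops below a threshold. This is a minimal sketch, not the paper's implementation: the function names, the cosine metric, the mean-pooled prototypes, and the threshold value are all illustrative assumptions.

```python
import numpy as np

def build_prototypes(features_by_class):
    # One prototype per known class: the mean of that class's embedding
    # vectors. Assumption: embeddings come from a pretrained image
    # classifier, per the abstract's "upstream task"; details are illustrative.
    return {label: np.mean(np.stack(vecs), axis=0)
            for label, vecs in features_by_class.items()}

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def classify_or_flag(embedding, prototypes, threshold=0.8):
    # Compare the input against every prototype. A clear match yields a
    # class label; a drop in the best similarity below the (assumed)
    # threshold flags a potential deepfake or previously unseen class.
    scores = {label: cosine_similarity(embedding, proto)
              for label, proto in prototypes.items()}
    best_label = max(scores, key=scores.get)
    if scores[best_label] < threshold:
        return "potential_deepfake_or_unseen", scores[best_label]
    return best_label, scores[best_label]

# Illustrative usage with random embeddings standing in for real features:
rng = np.random.default_rng(0)
protos = build_prototypes({"person_a": [rng.normal(size=128) for _ in range(5)]})
label, score = classify_or_flag(rng.normal(size=128), protos)
```

Because new classes only require computing fresh prototypes rather than retraining a deep network, a scheme like this is consistent with the abstract's claim of very fast (2.7-second) retraining on new data.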