
Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive. / Weerasooriya, Tharindu; Dutta, Sujan; Ranasinghe, Tharindu et al.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2023. p. 11648-11668.


Harvard

Weerasooriya, T, Dutta, S, Ranasinghe, T, Zampieri, M, Homan, C & KhudaBukhsh, A 2023, Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive. in Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, Stroudsburg, PA, pp. 11648-11668, The 2023 Conference on Empirical Methods in Natural Language Processing, Singapore, Singapore, 6/12/23. https://doi.org/10.18653/v1/2023.emnlp-main.713

APA

Weerasooriya, T., Dutta, S., Ranasinghe, T., Zampieri, M., Homan, C., & KhudaBukhsh, A. (2023). Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (pp. 11648-11668). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.emnlp-main.713

Vancouver

Weerasooriya T, Dutta S, Ranasinghe T, Zampieri M, Homan C, KhudaBukhsh A. Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics. 2023. p. 11648-11668 doi: 10.18653/v1/2023.emnlp-main.713

Author

Weerasooriya, Tharindu; Dutta, Sujan; Ranasinghe, Tharindu et al. / Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. Stroudsburg, PA: Association for Computational Linguistics, 2023. pp. 11648-11668

BibTeX

@inproceedings{79a4c22bb45b4944ba677a94a23ac82e,
title = "Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive",
abstract = "Offensive speech detection is a key component of content moderation. However, what is offensive can be highly subjective. This paper investigates how machine and human moderators disagree on what is offensive when it comes to real-world social web political discourse. We show that (1) there is extensive disagreement among the moderators (humans and machines); and (2) human and large-language-model classifiers are unable to predict how other human raters will respond, based on their political leanings. For (1), we conduct a noise audit at an unprecedented scale that combines both machine and human responses. For (2), we introduce a first-of-its-kind dataset of vicarious offense. Our noise audit reveals that moderation outcomes vary wildly across different machine moderators. Our experiments with human moderators suggest that political leanings combined with sensitive issues affect both first-person and vicarious offense. The dataset is available through https://github.com/Homan-Lab/voiced.",
author = "Tharindu Weerasooriya and Sujan Dutta and Tharindu Ranasinghe and Marcos Zampieri and Christopher Homan and Ashiqur KhudaBukhsh",
year = "2023",
month = dec,
day = "6",
doi = "10.18653/v1/2023.emnlp-main.713",
language = "English",
pages = "11648--11668",
booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing",
publisher = "Association for Computational Linguistics",
note = "The 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023 ; Conference date: 06-12-2023 Through 10-12-2023",
url = "https://2023.emnlp.org/",

}
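
Should you need the fields of the BibTeX record above in structured form, the short Python sketch below is one way to pull them out. It is a minimal sketch rather than a general BibTeX parser: it assumes every field is double-quoted on a single line, which holds for this particular entry (the entry is abbreviated in the snippet for brevity).

import re

# The BibTeX record above, abbreviated here for brevity; paste the
# full entry in practice.
BIBTEX_ENTRY = '''@inproceedings{79a4c22bb45b4944ba677a94a23ac82e,
title = "Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive",
doi = "10.18653/v1/2023.emnlp-main.713",
pages = "11648--11668",
}'''

# Matches key = "value" pairs; sufficient because this entry uses
# double-quoted values throughout (no brace-delimited fields).
FIELD_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')

fields = dict(FIELD_RE.findall(BIBTEX_ENTRY))
print(fields["title"])
print(fields["doi"])    # 10.18653/v1/2023.emnlp-main.713
print(fields["pages"])  # 11648--11668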

RIS

TY  - GEN
T1  - Vicarious Offense and Noise Audit of Offensive Speech Classifiers: Unifying Human and Machine Disagreement on What is Offensive
T2  - The 2023 Conference on Empirical Methods in Natural Language Processing
AU  - Weerasooriya, Tharindu
AU  - Dutta, Sujan
AU  - Ranasinghe, Tharindu
AU  - Zampieri, Marcos
AU  - Homan, Christopher
AU  - KhudaBukhsh, Ashiqur
PY  - 2023/12/6
Y1  - 2023/12/6
N2  - Offensive speech detection is a key component of content moderation. However, what is offensive can be highly subjective. This paper investigates how machine and human moderators disagree on what is offensive when it comes to real-world social web political discourse. We show that (1) there is extensive disagreement among the moderators (humans and machines); and (2) human and large-language-model classifiers are unable to predict how other human raters will respond, based on their political leanings. For (1), we conduct a noise audit at an unprecedented scale that combines both machine and human responses. For (2), we introduce a first-of-its-kind dataset of vicarious offense. Our noise audit reveals that moderation outcomes vary wildly across different machine moderators. Our experiments with human moderators suggest that political leanings combined with sensitive issues affect both first-person and vicarious offense. The dataset is available through https://github.com/Homan-Lab/voiced.
AB  - Offensive speech detection is a key component of content moderation. However, what is offensive can be highly subjective. This paper investigates how machine and human moderators disagree on what is offensive when it comes to real-world social web political discourse. We show that (1) there is extensive disagreement among the moderators (humans and machines); and (2) human and large-language-model classifiers are unable to predict how other human raters will respond, based on their political leanings. For (1), we conduct a noise audit at an unprecedented scale that combines both machine and human responses. For (2), we introduce a first-of-its-kind dataset of vicarious offense. Our noise audit reveals that moderation outcomes vary wildly across different machine moderators. Our experiments with human moderators suggest that political leanings combined with sensitive issues affect both first-person and vicarious offense. The dataset is available through https://github.com/Homan-Lab/voiced.
U2  - 10.18653/v1/2023.emnlp-main.713
DO  - 10.18653/v1/2023.emnlp-main.713
M3  - Conference contribution/Paper
SP  - 11648
EP  - 11668
BT  - Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
PB  - Association for Computational Linguistics
CY  - Stroudsburg, PA
Y2  - 6 December 2023 through 10 December 2023
ER  -
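
Likewise, if the RIS record needs to be consumed programmatically, a small hand-rolled parser covers it. The sketch below (plain Python, no third-party RIS library assumed) splits each tag line on the standard "TAG  - " prefix and collects repeated tags such as AU into lists; the record is abbreviated in the snippet.

# Minimal RIS tag parser for the record above; handles repeated
# tags (e.g. AU) by collecting values into lists.
RIS_RECORD = """TY  - GEN
AU  - Weerasooriya, Tharindu
AU  - Dutta, Sujan
DO  - 10.18653/v1/2023.emnlp-main.713
ER  - """

def parse_ris(text):
    record = {}
    for line in text.splitlines():
        # RIS lines look like "XX  - value"; skip anything else.
        if len(line) < 6 or line[2:6] != "  - ":
            continue
        tag, value = line[:2], line[6:].strip()
        record.setdefault(tag, []).append(value)
    return record

rec = parse_ris(RIS_RECORD)
print(rec["AU"])  # ['Weerasooriya, Tharindu', 'Dutta, Sujan']
print(rec["DO"])  # ['10.18653/v1/2023.emnlp-main.713']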