Final published version
Licence: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
TY - GEN
T1 - fBERT: A Neural Transformer for Identifying Offensive Content
AU - Sarkar, Diptanu
AU - Zampieri, Marcos
AU - Ranasinghe, Tharindu
AU - Ororbia, Alex
PY - 2021/11/7
Y1 - 2021/11/7
AB - Transformer-based models such as BERT, XLNET, and XLM-R have achieved state-of-the-art performance across various NLP tasks including the identification of offensive language and hate speech, an important problem in social media. In this paper, we present fBERT, a BERT model retrained on SOLID, the largest English offensive language identification corpus available with over 1.4 million offensive instances. We evaluate fBERT’s performance on identifying offensive content on multiple English datasets and we test several thresholds for selecting instances from SOLID. The fBERT model will be made freely available to the community.
U2 - 10.18653/v1/2021.findings-emnlp.154
DO - 10.18653/v1/2021.findings-emnlp.154
M3 - Conference contribution/Paper
SP - 1792
EP - 1798
BT - Findings of the Association for Computational Linguistics: EMNLP 2021
PB - Association for Computational Linguistics
CY - Stroudsburg, PA
T2 - The 2021 Conference on Empirical Methods in Natural Language Processing
Y2 - 7 November 2021 through 11 November 2021
ER -