fBERT: A Neural Transformer for Identifying Offensive Content

Computing and Communications

Text available via DOI:

https://doi.org/10.18653/v1/2021.findings-emnlp.154
Final published version
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

fBERT: A Neural Transformer for Identifying Offensive Content. / Sarkar, Diptanu; Zampieri, Marcos; Ranasinghe, Tharindu et al.
Findings of the Association for Computational Linguistics: EMNLP 2021. Stroudsburg, PA: Association for Computational Linguistics, 2021. p. 1792-1798.

Harvard

Sarkar, D, Zampieri, M, Ranasinghe, T & Ororbia, A 2021, fBERT: A Neural Transformer for Identifying Offensive Content. in Findings of the Association for Computational Linguistics: EMNLP 2021. Association for Computational Linguistics, Stroudsburg, PA, pp. 1792-1798, The 2021 Conference on Empirical Methods in Natural Language Processing, Punta Cana, Dominican Republic, 7/11/21. https://doi.org/10.18653/v1/2021.findings-emnlp.154

APA

Sarkar, D., Zampieri, M., Ranasinghe, T., & Ororbia, A. (2021). fBERT: A Neural Transformer for Identifying Offensive Content. In Findings of the Association for Computational Linguistics: EMNLP 2021 (pp. 1792-1798). Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.findings-emnlp.154

Vancouver

Sarkar D, Zampieri M, Ranasinghe T, Ororbia A. fBERT: A Neural Transformer for Identifying Offensive Content. In Findings of the Association for Computational Linguistics: EMNLP 2021. Stroudsburg, PA: Association for Computational Linguistics. 2021. p. 1792-1798 doi: 10.18653/v1/2021.findings-emnlp.154

Author

Sarkar, Diptanu ; Zampieri, Marcos ; Ranasinghe, Tharindu et al. / fBERT: A Neural Transformer for Identifying Offensive Content. Findings of the Association for Computational Linguistics: EMNLP 2021. Stroudsburg, PA : Association for Computational Linguistics, 2021. pp. 1792-1798

Bibtex

@inproceedings{471b277caeb046a681089014a8ec2371,
  title = "fBERT: A Neural Transformer for Identifying Offensive Content",
  abstract = "Transformer-based models such as BERT, XLNET, and XLM-R have achieved state-of-the-art performance across various NLP tasks including the identification of offensive language and hate speech, an important problem in social media. In this paper, we present fBERT, a BERT model retrained on SOLID, the largest English offensive language identification corpus available with over 1.4 million offensive instances. We evaluate fBERT{\textquoteright}s performance on identifying offensive content on multiple English datasets and we test several thresholds for selecting instances from SOLID. The fBERT model will be made freely available to the community.",
  author = "Diptanu Sarkar and Marcos Zampieri and Tharindu Ranasinghe and Alex Ororbia",
  year = "2021",
  month = nov,
  day = "7",
  doi = "10.18653/v1/2021.findings-emnlp.154",
  language = "English",
  pages = "1792--1798",
  booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
  publisher = "Association for Computational Linguistics",
  note = "The 2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021 ; Conference date: 07-11-2021 Through 11-11-2021",
  url = "https://2021.emnlp.org/",
}

RIS

TY - GEN
T1 - fBERT: A Neural Transformer for Identifying Offensive Content
AU - Sarkar, Diptanu
AU - Zampieri, Marcos
AU - Ranasinghe, Tharindu
AU - Ororbia, Alex
PY - 2021/11/7
Y1 - 2021/11/7
N2 - Transformer-based models such as BERT, XLNET, and XLM-R have achieved state-of-the-art performance across various NLP tasks including the identification of offensive language and hate speech, an important problem in social media. In this paper, we present fBERT, a BERT model retrained on SOLID, the largest English offensive language identification corpus available with over 1.4 million offensive instances. We evaluate fBERT’s performance on identifying offensive content on multiple English datasets and we test several thresholds for selecting instances from SOLID. The fBERT model will be made freely available to the community.
AB - Transformer-based models such as BERT, XLNET, and XLM-R have achieved state-of-the-art performance across various NLP tasks including the identification of offensive language and hate speech, an important problem in social media. In this paper, we present fBERT, a BERT model retrained on SOLID, the largest English offensive language identification corpus available with over 1.4 million offensive instances. We evaluate fBERT’s performance on identifying offensive content on multiple English datasets and we test several thresholds for selecting instances from SOLID. The fBERT model will be made freely available to the community.
U2 - 10.18653/v1/2021.findings-emnlp.154
DO - 10.18653/v1/2021.findings-emnlp.154
M3 - Conference contribution/Paper
SP - 1792
EP - 1798
BT - Findings of the Association for Computational Linguistics: EMNLP 2021
PB - Association for Computational Linguistics
CY - Stroudsburg, PA
T2 - The 2021 Conference on Empirical Methods in Natural Language Processing
Y2 - 7 November 2021 through 11 November 2021
ER -
