Thai Defamatory Text Classification on Social Media

Computing and Communications

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Thai Defamatory Text Classification on Social Media. / Arreerard, Ratchakrit; Senivongse, Twittie.
Proceedings of the 2018 IEEE International Conference on Big Data, Cloud Computing, Data Science Engineering: Proceedings of the BCD2018. Yonago, Japan, 2018. p. 73-78.

Harvard

Arreerard, R & Senivongse, T 2018, Thai Defamatory Text Classification on Social Media. in Proceedings of the 2018 IEEE International Conference on Big Data, Cloud Computing, Data Science Engineering: Proceedings of the BCD2018. Yonago, Japan, pp. 73-78. https://doi.org/10.1109/BCD2018.2018.00019

APA

Arreerard, R., & Senivongse, T. (2018). Thai Defamatory Text Classification on Social Media. In Proceedings of the 2018 IEEE International Conference on Big Data, Cloud Computing, Data Science Engineering: Proceedings of the BCD2018 (pp. 73-78). https://doi.org/10.1109/BCD2018.2018.00019

Vancouver

Arreerard R, Senivongse T. Thai Defamatory Text Classification on Social Media. In Proceedings of the 2018 IEEE International Conference on Big Data, Cloud Computing, Data Science Engineering: Proceedings of the BCD2018. Yonago, Japan. 2018. p. 73-78 doi: 10.1109/BCD2018.2018.00019

Author

Arreerard, Ratchakrit; Senivongse, Twittie. / Thai Defamatory Text Classification on Social Media. Proceedings of the 2018 IEEE International Conference on Big Data, Cloud Computing, Data Science Engineering: Proceedings of the BCD2018. Yonago, Japan, 2018. pp. 73-78

Bibtex

@inproceedings{5487c187363a461a944ca5fb4c89c4f0,

title = "Thai Defamatory Text Classification on Social Media",

abstract = "Development of social media has brought a huge change to social communities in several aspects. They offer a place where social media users can post information, express opinions, and share interests. However, some information and opinions may cause a negative impact on the person mentioned in the post and that person can become a target of defamation. In Thailand, although defaming someone on social media is illegal, most social media users are not aware of it. To raise awareness of this issue, this paper proposes the classification of defamatory text in Thai language. Several approaches to text classification are used to analyze textual comments to political news and articles on Facebook, including word n-grams, character n-grams, specific terms, grammatical dependency structure, and sentiment polarity. The experiment is conducted using two machine learning methods with several combinations of the approaches. The result shows that SVM performed better than Na{\"i}ve Bayes, and word n-grams and character n-grams are more efficient than other approaches with F score of 0.64 and accuracy of 0.74. In addition, dependency structure, specific terms, and sentiment polarity perform quite well with precision of 0.65 and accuracy of 0.66, but with lower recall rate of 0.35. We discuss linguistic variations in Thai language which affect the performance of the methods.",

keywords = "Text classification, Machine learning, social media, Defamation",

author = "Ratchakrit Arreerard and Twittie Senivongse",

year = "2018",

month = jul,

day = "12",

doi = "10.1109/BCD2018.2018.00019",

language = "English",

isbn = "978-1-5386-5606-8",

pages = "73--78",

booktitle = "Proceedings of the 2018 IEEE International Conference on Big Data, Cloud Computing, Data Science Engineering",

}

RIS

TY - GEN

T1 - Thai Defamatory Text Classification on Social Media

AU - Arreerard, Ratchakrit

AU - Senivongse, Twittie

PY - 2018/7/12

Y1 - 2018/7/12

N2 - Development of social media has brought a huge change to social communities in several aspects. They offer a place where social media users can post information, express opinions, and share interests. However, some information and opinions may cause a negative impact on the person mentioned in the post and that person can become a target of defamation. In Thailand, although defaming someone on social media is illegal, most social media users are not aware of it. To raise awareness of this issue, this paper proposes the classification of defamatory text in Thai language. Several approaches to text classification are used to analyze textual comments to political news and articles on Facebook, including word n-grams, character n-grams, specific terms, grammatical dependency structure, and sentiment polarity. The experiment is conducted using two machine learning methods with several combinations of the approaches. The result shows that SVM performed better than Naïve Bayes, and word n-grams and character n-grams are more efficient than other approaches with F score of 0.64 and accuracy of 0.74. In addition, dependency structure, specific terms, and sentiment polarity perform quite well with precision of 0.65 and accuracy of 0.66, but with lower recall rate of 0.35. We discuss linguistic variations in Thai language which affect the performance of the methods.

AB - Development of social media has brought a huge change to social communities in several aspects. They offer a place where social media users can post information, express opinions, and share interests. However, some information and opinions may cause a negative impact on the person mentioned in the post and that person can become a target of defamation. In Thailand, although defaming someone on social media is illegal, most social media users are not aware of it. To raise awareness of this issue, this paper proposes the classification of defamatory text in Thai language. Several approaches to text classification are used to analyze textual comments to political news and articles on Facebook, including word n-grams, character n-grams, specific terms, grammatical dependency structure, and sentiment polarity. The experiment is conducted using two machine learning methods with several combinations of the approaches. The result shows that SVM performed better than Naïve Bayes, and word n-grams and character n-grams are more efficient than other approaches with F score of 0.64 and accuracy of 0.74. In addition, dependency structure, specific terms, and sentiment polarity perform quite well with precision of 0.65 and accuracy of 0.66, but with lower recall rate of 0.35. We discuss linguistic variations in Thai language which affect the performance of the methods.

KW - Text classification

KW - Machine learning

KW - social media

KW - Defamation

U2 - 10.1109/BCD2018.2018.00019

DO - 10.1109/BCD2018.2018.00019

M3 - Conference contribution/Paper

SN - 978-1-5386-5606-8

SP - 73

EP - 78

BT - Proceedings of the 2018 IEEE International Conference on Big Data, Cloud Computing, Data Science Engineering

CY - Yonago, Japan

ER -
