Final published version, 526 KB, PDF document
Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
TY - GEN
T1 - Survey on Thai NLP Language Resources and Tools
AU - Arreerard, Ratchakrit
AU - Mander, Stephen
AU - Piao, Scott
PY - 2022/6/16
Y1 - 2022/6/16
N2 - Over the past decades, Natural Language Processing (NLP) research has been expanding to cover more languages. Recently particularly, NLP community has paid increasing attention to under-resourced languages. However, there are still many languages for which NLP research is limited in terms of both language resources and software tools. Thai language is one of the under-resourced languages in the NLP domain, although it is spoken by nearly 70 million people globally. In this paper, we report on our survey on the past development of Thai NLP research to help understand its current state and future research directions. Our survey shows that, although Thai NLP community has achieved a significant achievement over the past three decades, particularly on NLP upstream tasks such as tokenisation, research on downstream tasks such as syntactic parsing and semantic analysis is still limited. But we foresee that Thai NLP research will advance rapidly as richer Thai language resources and more robust NLP techniques become available.
AB - Over the past decades, Natural Language Processing (NLP) research has been expanding to cover more languages. Recently particularly, NLP community has paid increasing attention to under-resourced languages. However, there are still many languages for which NLP research is limited in terms of both language resources and software tools. Thai language is one of the under-resourced languages in the NLP domain, although it is spoken by nearly 70 million people globally. In this paper, we report on our survey on the past development of Thai NLP research to help understand its current state and future research directions. Our survey shows that, although Thai NLP community has achieved a significant achievement over the past three decades, particularly on NLP upstream tasks such as tokenisation, research on downstream tasks such as syntactic parsing and semantic analysis is still limited. But we foresee that Thai NLP research will advance rapidly as richer Thai language resources and more robust NLP techniques become available.
KW - Natural Language Processing
KW - Thai NLP
KW - Survey
KW - Language Resource
KW - NLP tools
M3 - Conference contribution/Paper
SP - 6495
EP - 6505
BT - Language Resources and Evaluation Conference LREC 2022 Proceedings
PB - European Language Resources Association (ELRA)
CY - Paris
T2 - The 13th Edition of Language Resources and Evaluation Conference
Y2 - 21 June 2022 through 23 June 2022
ER -