Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Chapter
TY - CHAP
T1 - Developing multilingual automatic semantic annotation systems
AU - Löfberg, Laura
AU - Rayson, Paul
PY - 2019/6/10
Y1 - 2019/6/10
N2 - We report the development of a multilingual system for the semantic analysis of text. The research on the English Semantic Tagger started in 1990, and after that the system has been ported, first, to Finnish and Russian, and, thereafter, to Arabic, Chinese, Czech, Dutch, French, Italian, Malay, Portuguese, Spanish, Urdu, and Welsh. The development processes of the semantic taggers for English, Finnish, and Russian were relatively similar, involving manual construction of the semantic lexicons, whereas, to speed up the research, new bootstrapping methods including computational approaches have been utilised later in the creation of the semantic lexicons for the other languages. We describe these manual and automatic processes as well as envisaging directions for future development. The resulting multilingual framework of semantic taggers based on equivalent semantic lexicons and one common semantic taxonomy offers a wealth of potential applications which this chapter also illustrates. In addition to developing monolingual applications for these semantic taggers, it is also possible to create cross-lingual and multilingual applications. Furthermore, while the existing semantic analysis systems are designed for the analysis of general language, such systems can also be tailored for a specific purpose to deal more accurately with only one particular domain or task.
AB - We report the development of a multilingual system for the semantic analysis of text. The research on the English Semantic Tagger started in 1990, and after that the system has been ported, first, to Finnish and Russian, and, thereafter, to Arabic, Chinese, Czech, Dutch, French, Italian, Malay, Portuguese, Spanish, Urdu, and Welsh. The development processes of the semantic taggers for English, Finnish, and Russian were relatively similar, involving manual construction of the semantic lexicons, whereas, to speed up the research, new bootstrapping methods including computational approaches have been utilised later in the creation of the semantic lexicons for the other languages. We describe these manual and automatic processes as well as envisaging directions for future development. The resulting multilingual framework of semantic taggers based on equivalent semantic lexicons and one common semantic taxonomy offers a wealth of potential applications which this chapter also illustrates. In addition to developing monolingual applications for these semantic taggers, it is also possible to create cross-lingual and multilingual applications. Furthermore, while the existing semantic analysis systems are designed for the analysis of general language, such systems can also be tailored for a specific purpose to deal more accurately with only one particular domain or task.
KW - Corpus linguistics
KW - Cross-lingual applications
KW - Domain-specific applications
KW - Multilingual applications
KW - Semantic annotation
U2 - 10.1017/9781108525695.006
DO - 10.1017/9781108525695.006
M3 - Chapter
AN - SCOPUS:85098039578
SN - 9781108423274
SP - 94
EP - 109
BT - Advances in Empirical Translation Studies
A2 - Ji, Meng
A2 - Oakes, Michael
PB - Cambridge University Press
CY - Cambridge
ER -