
Electronic data

  • N15-1137

    Final published version, 270 KB, PDF document


Development of the multilingual semantic annotation system

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN > Conference contribution/Paper > peer-review

Published
Publication date: 1/06/2015
Host publication: The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Proceedings of the Conference
Publisher: Association for Computational Linguistics
Pages: 1268-1274
Number of pages: 7
ISBN (print): 9781941643495
Original language: English
Event: The 2015 Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL HLT 2015) - Denver, Colorado, United States
Duration: 31/05/2015 to 5/06/2015

Conference

Conference: The 2015 Conference of the North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL HLT 2015)
Country/Territory: United States
City: Denver, Colorado
Period: 31/05/15 to 5/06/15


Abstract

This paper reports on our research to generate multilingual semantic lexical resources and develop multilingual semantic annotation software, which assigns each word in running text to a semantic category based on a lexical semantic classification scheme. Such tools have an important role in developing intelligent multilingual NLP, text mining and ICT systems. In this work, we aim to extend an existing English semantic annotation tool to cover a range of languages, namely Italian, Chinese and Brazilian Portuguese, by bootstrapping new semantic lexical resources via automatically translating existing English semantic lexicons into these languages. We used a set of bilingual dictionaries and word lists for this purpose. In our experiment, with minor manual improvement of the automatically generated semantic lexicons, the prototype tools based on the new lexicons achieved an average lexical coverage of 79.86% and an average annotation precision of 71.42% (if only precise annotations are considered) or 84.64% (if partially correct annotations are included) on the three languages. Our experiment demonstrates that it is feasible to rapidly develop prototype semantic annotation tools for new languages by automatically bootstrapping new semantic lexicons based on existing ones.
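
The bootstrapping idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the tag labels (`ANIMAL`, `MONEY`, and so on), the toy dictionary entries and the token list are all hypothetical, and the coverage measure is assumed to be the share of tokens found in the lexicon, which the abstract does not spell out.

```python
from collections import defaultdict


def bootstrap_lexicon(english_lexicon, bilingual_dict):
    """Carry semantic tags from English words over to their translations.

    Every translation inherits every tag of its English source word,
    so ambiguous words produce noisy entries that need manual cleanup.
    """
    target = defaultdict(list)
    for en_word, tags in english_lexicon.items():
        for translation in bilingual_dict.get(en_word, []):
            for tag in tags:
                if tag not in target[translation]:
                    target[translation].append(tag)
    return dict(target)


def lexical_coverage(tokens, lexicon):
    """Share of tokens with at least one lexicon entry (assumed definition)."""
    if not tokens:
        return 0.0
    return sum(1 for token in tokens if token in lexicon) / len(tokens)


# Hypothetical toy data: tag labels are placeholders, not a real tagset.
english_lexicon = {
    "dog": ["ANIMAL"],
    "bank": ["MONEY", "GEOGRAPHY"],
    "eat": ["FOOD"],
}
en_it = {"dog": ["cane"], "bank": ["banca", "riva"], "eat": ["mangiare"]}

italian_lexicon = bootstrap_lexicon(english_lexicon, en_it)
tokens = ["il", "cane", "mangiare", "banca"]
coverage = lexical_coverage(tokens, italian_lexicon)
print(italian_lexicon)
print(f"coverage: {coverage:.2%}")
```

In this toy data both "banca" and "riva" inherit the `MONEY` and `GEOGRAPHY` tags, although each fits only one sense. Over-generation of this kind is one plausible reason an automatically generated lexicon would need the manual improvement the abstract mentions.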

Bibliographic note

Date of Acceptance: 20/02/2015