
Electronic data

  • 257_lrec2016_final_full_paper

    Rights statement: The LREC 2016 Proceedings are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License

    Final published version, 505 KB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN > Conference contribution/Paper > peer-review

Published

Standard

Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages. / Piao, Scott Songlin; Rayson, Paul Edward; Archer, Dawn; Bianchi, Francesca; Dayrell, Carmen; El-Haj, Mahmoud; Jiménez, Ricardo-María; Knight, Dawn; Křen, Michal; Lofberg, Laura; Nawab, Rao Muhammad Adeel; Shafi, Jawad; Teh, Phoey Lee; Mudraya, Olga.

LREC 2016, Tenth International Conference on Language Resources and Evaluation. ed. / Nicoletta Calzolari; Khalid Choukri; Thierry Declerck; Marko Grobelnik; Bente Maegaard; Joseph Mariani; Asuncion Moreno; Jan Odijk; Stelios Piperidis. European Language Resources Association (ELRA), 2016. p. 2614-2619.

Harvard

Piao, SS, Rayson, PE, Archer, D, Bianchi, F, Dayrell, C, El-Haj, M, Jiménez, R-M, Knight, D, Křen, M, Lofberg, L, Nawab, RMA, Shafi, J, Teh, PL & Mudraya, O 2016, Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages. in N Calzolari, K Choukri, T Declerck, M Grobelnik, B Maegaard, J Mariani, A Moreno, J Odijk & S Piperidis (eds), LREC 2016, Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA), pp. 2614-2619, 10th edition of the Language Resources and Evaluation Conference (LREC2016), Portorož, Slovenia, 23/05/16. <http://www.lrec-conf.org/proceedings/lrec2016/pdf/257_Paper.pdf>

APA

Piao, S. S., Rayson, P. E., Archer, D., Bianchi, F., Dayrell, C., El-Haj, M., Jiménez, R-M., Knight, D., Křen, M., Lofberg, L., Nawab, R. M. A., Shafi, J., Teh, P. L., & Mudraya, O. (2016). Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages. In N. Calzolari, K. Choukri, T. Declerck, M. Grobelnik, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), LREC 2016, Tenth International Conference on Language Resources and Evaluation (pp. 2614-2619). European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2016/pdf/257_Paper.pdf

Vancouver

Piao SS, Rayson PE, Archer D, Bianchi F, Dayrell C, El-Haj M et al. Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages. In Calzolari N, Choukri K, Declerck T, Grobelnik M, Maegaard B, Mariani J, Moreno A, Odijk J, Piperidis S, editors, LREC 2016, Tenth International Conference on Language Resources and Evaluation. European Language Resources Association (ELRA). 2016. p. 2614-2619

Author

Piao, Scott Songlin ; Rayson, Paul Edward ; Archer, Dawn ; Bianchi, Francesca ; Dayrell, Carmen ; El-Haj, Mahmoud ; Jiménez, Ricardo-María ; Knight, Dawn ; Křen, Michal ; Lofberg, Laura ; Nawab, Rao Muhammad Adeel ; Shafi, Jawad ; Teh, Phoey Lee ; Mudraya, Olga. / Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages. LREC 2016, Tenth International Conference on Language Resources and Evaluation. editor / Nicoletta Calzolari ; Khalid Choukri ; Thierry Declerck ; Marko Grobelnik ; Bente Maegaard ; Joseph Mariani ; Asuncion Moreno ; Jan Odijk ; Stelios Piperidis. European Language Resources Association (ELRA), 2016. pp. 2614-2619

Bibtex

@inproceedings{71b6ec06089442b383a6a232ad20c191,
title = "Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages",
abstract = "The last two decades have seen the development of various semantic lexical resources such as WordNet (Miller, 1995) and the USAS semantic lexicon (Rayson et al., 2004), which have played an important role in the areas of natural language processing and corpus-based studies. Recently, increasing efforts have been devoted to extending the semantic frameworks of existing lexical knowledge resources to cover more languages, such as EuroWordNet and Global WordNet. In this paper, we report on the construction of large-scale multilingual semantic lexicons for twelve languages, which employ the unified Lancaster semantic taxonomy and provide a multilingual lexical knowledge base for the automatic UCREL semantic annotation system (USAS). Our work contributes towards the goal of constructing larger-scale and higher-quality multilingual semantic lexical resources and developing corpus annotation tools based on them. Lexical coverage is an important factor concerning the quality of the lexicons and the performance of the corpus annotation tools, and in this experiment we focus on evaluating the lexical coverage achieved by the multilingual lexicons and semantic annotation tools based on them. Our evaluation shows that some semantic lexicons such as those for Finnish and Italian have achieved lexical coverage of over 90% while others need further expansion.",
keywords = "Semantic Lexicon, semantic annotation tool, multilingual lexicon",
author = "Piao, {Scott Songlin} and Rayson, {Paul Edward} and Dawn Archer and Francesca Bianchi and Carmen Dayrell and Mahmoud El-Haj and Ricardo-Mar{\'i}a Jim{\'e}nez and Dawn Knight and Michal K{\v r}en and Laura Lofberg and Nawab, {Rao Muhammad Adeel} and Jawad Shafi and Teh, {Phoey Lee} and Olga Mudraya",
note = "The LREC 2016 Proceedings are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License; 10th edition of the Language Resources and Evaluation Conference (LREC2016); Conference date: 23-05-2016 through 28-05-2016",
year = "2016",
month = may,
day = "23",
language = "English",
isbn = "9782951740891",
pages = "2614--2619",
editor = "Nicoletta Calzolari and Khalid Choukri and Thierry Declerck and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis",
booktitle = "LREC 2016, Tenth International Conference on Language Resources and Evaluation",
publisher = "European Language Resources Association (ELRA)",

}

RIS

TY - GEN

T1 - Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages

AU - Piao, Scott Songlin

AU - Rayson, Paul Edward

AU - Archer, Dawn

AU - Bianchi, Francesca

AU - Dayrell, Carmen

AU - El-Haj, Mahmoud

AU - Jiménez, Ricardo-María

AU - Knight, Dawn

AU - Křen, Michal

AU - Lofberg, Laura

AU - Nawab, Rao Muhammad Adeel

AU - Shafi, Jawad

AU - Teh, Phoey Lee

AU - Mudraya, Olga

N1 - The LREC 2016 Proceedings are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License

PY - 2016/5/23

Y1 - 2016/5/23

AB - The last two decades have seen the development of various semantic lexical resources such as WordNet (Miller, 1995) and the USAS semantic lexicon (Rayson et al., 2004), which have played an important role in the areas of natural language processing and corpus-based studies. Recently, increasing efforts have been devoted to extending the semantic frameworks of existing lexical knowledge resources to cover more languages, such as EuroWordNet and Global WordNet. In this paper, we report on the construction of large-scale multilingual semantic lexicons for twelve languages, which employ the unified Lancaster semantic taxonomy and provide a multilingual lexical knowledge base for the automatic UCREL semantic annotation system (USAS). Our work contributes towards the goal of constructing larger-scale and higher-quality multilingual semantic lexical resources and developing corpus annotation tools based on them. Lexical coverage is an important factor concerning the quality of the lexicons and the performance of the corpus annotation tools, and in this experiment we focus on evaluating the lexical coverage achieved by the multilingual lexicons and semantic annotation tools based on them. Our evaluation shows that some semantic lexicons such as those for Finnish and Italian have achieved lexical coverage of over 90% while others need further expansion.

KW - Semantic Lexicon

KW - semantic annotation tool

KW - multilingual lexicon

UR - http://lrec2016.lrec-conf.org/en/

M3 - Conference contribution/Paper

SN - 9782951740891

SP - 2614

EP - 2619

BT - LREC 2016, Tenth International Conference on Language Resources and Evaluation

A2 - Calzolari, Nicoletta

A2 - Choukri, Khalid

A2 - Declerck, Thierry

A2 - Grobelnik, Marko

A2 - Maegaard, Bente

A2 - Mariani, Joseph

A2 - Moreno, Asuncion

A2 - Odijk, Jan

A2 - Piperidis, Stelios

PB - European Language Resources Association (ELRA)

T2 - 10th edition of the Language Resources and Evaluation Conference (LREC2016)

Y2 - 23 May 2016 through 28 May 2016

ER -