Electronic data

  • W17-1908

    Final published version, 1.01 MB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License

Creating and validating multilingual semantic representations for six languages: expert versus non-expert crowds

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN; Conference contribution/Paper; peer-review

Published

Standard

Creating and validating multilingual semantic representations for six languages: expert versus non-expert crowds. / El-Haj, Mahmoud; Rayson, Paul; Piao, Scott et al.
Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 2017. p. 61-71.

Harvard

El-Haj, M, Rayson, P, Piao, S & Wattam, S 2017, Creating and validating multilingual semantic representations for six languages: expert versus non-expert crowds. in Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, pp. 61-71. <http://aclweb.org/anthology/W17-1908>

APA

El-Haj, M., Rayson, P., Piao, S., & Wattam, S. (2017). Creating and validating multilingual semantic representations for six languages: expert versus non-expert crowds. In Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (pp. 61-71). Association for Computational Linguistics. http://aclweb.org/anthology/W17-1908

Vancouver

El-Haj M, Rayson P, Piao S, Wattam S. Creating and validating multilingual semantic representations for six languages: expert versus non-expert crowds. In Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics. 2017. p. 61-71

Author

El-Haj, Mahmoud ; Rayson, Paul ; Piao, Scott et al. / Creating and validating multilingual semantic representations for six languages : expert versus non-expert crowds. Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications: Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 2017. pp. 61-71

Bibtex

@inproceedings{675fc4ab3efa463da82965575f9cadd5,
title = "Creating and validating multilingual semantic representations for six languages: expert versus non-expert crowds",
abstract = "Creating high-quality wide-coverage multilingual semantic lexicons to support knowledge-based approaches is a challenging time-consuming manual task. This has traditionally been performed by linguistic experts: a slow and expensive process. We present an experiment in which we adapt and evaluate crowdsourcing methods employing native speakers to generate a list of coarse-grained senses under a common multilingual semantic taxonomy for sets of words in six languages. 451 non-experts (including 427 Mechanical Turk workers) and 15 expert participants semantically annotated 250 words manually for Arabic, Chinese, English, Italian, Portuguese and Urdu lexicons. In order to avoid erroneous (spam) crowdsourced results, we used a novel task-specific two-phase filtering process where users were asked to identify synonyms in the target language, and remove erroneous senses.",
author = "Mahmoud El-Haj and Paul Rayson and Scott Piao and Stephen Wattam",
year = "2017",
month = apr,
day = "3",
language = "English",
isbn = "9781945626500",
pages = "61--71",
booktitle = "Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications",
publisher = "Association for Computational Linguistics",

}

RIS

TY - GEN

T1 - Creating and validating multilingual semantic representations for six languages

T2 - expert versus non-expert crowds

AU - El-Haj, Mahmoud

AU - Rayson, Paul

AU - Piao, Scott

AU - Wattam, Stephen

PY - 2017/4/3

Y1 - 2017/4/3

N2 - Creating high-quality wide-coverage multilingual semantic lexicons to support knowledge-based approaches is a challenging time-consuming manual task. This has traditionally been performed by linguistic experts: a slow and expensive process. We present an experiment in which we adapt and evaluate crowdsourcing methods employing native speakers to generate a list of coarse-grained senses under a common multilingual semantic taxonomy for sets of words in six languages. 451 non-experts (including 427 Mechanical Turk workers) and 15 expert participants semantically annotated 250 words manually for Arabic, Chinese, English, Italian, Portuguese and Urdu lexicons. In order to avoid erroneous (spam) crowdsourced results, we used a novel task-specific two-phase filtering process where users were asked to identify synonyms in the target language, and remove erroneous senses.

AB - Creating high-quality wide-coverage multilingual semantic lexicons to support knowledge-based approaches is a challenging time-consuming manual task. This has traditionally been performed by linguistic experts: a slow and expensive process. We present an experiment in which we adapt and evaluate crowdsourcing methods employing native speakers to generate a list of coarse-grained senses under a common multilingual semantic taxonomy for sets of words in six languages. 451 non-experts (including 427 Mechanical Turk workers) and 15 expert participants semantically annotated 250 words manually for Arabic, Chinese, English, Italian, Portuguese and Urdu lexicons. In order to avoid erroneous (spam) crowdsourced results, we used a novel task-specific two-phase filtering process where users were asked to identify synonyms in the target language, and remove erroneous senses.

M3 - Conference contribution/Paper

SN - 9781945626500

SP - 61

EP - 71

BT - Proceedings of the 1st Workshop on Sense, Concept and Entity Representations and their Applications

PB - Association for Computational Linguistics

ER -