Electronic data

  • 2024.readi-1.4

    Final published version, 195 KB, PDF document

    Available under license: CC BY-NC: Creative Commons Attribution-NonCommercial 4.0 International License

An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework. / Shardlow, Matthew; Alva-Manchego, Fernando; Batista-Navarro, Riza Theresa et al.
Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024. ed. / Rodrigo Wilkens; Rémi Cardon; Amalia Todirascu; Núria Gala. ELRA and ICCL, 2024. p. 38-46.

Harvard

Shardlow, M, Alva-Manchego, F, Batista-Navarro, RT, Bott, S, Calderon Ramirez, S, Cardon, R, François, T, Hayakawa, A, Horbach, A, Hülsing, A, Ide, Y, Imperia, JM, Nohej, A, North, K, Occhipinti, L, Rojas, NP, Raihan, MN, Ranasinghe, T, Salazar, MS, Zampieri, M & Saggion, H 2024, An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework. in R Wilkens, R Cardon, A Todirascu & N Gala (eds), Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024. ELRA and ICCL, pp. 38-46, 3rd Workshop on Tools and Resources for REAding DIfficulties, Turin, Italy, 20/05/24. <https://aclanthology.org/2024.readi-1.0/>

APA

Shardlow, M., Alva-Manchego, F., Batista-Navarro, R. T., Bott, S., Calderon Ramirez, S., Cardon, R., François, T., Hayakawa, A., Horbach, A., Hülsing, A., Ide, Y., Imperia, J. M., Nohej, A., North, K., Occhipinti, L., Rojas, N. P., Raihan, M. N., Ranasinghe, T., Salazar, M. S., ... Saggion, H. (2024). An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework. In R. Wilkens, R. Cardon, A. Todirascu, & N. Gala (Eds.), Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024 (pp. 38-46). ELRA and ICCL. https://aclanthology.org/2024.readi-1.0/

Vancouver

Shardlow M, Alva-Manchego F, Batista-Navarro RT, Bott S, Calderon Ramirez S, Cardon R et al. An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework. In Wilkens R, Cardon R, Todirascu A, Gala N, editors, Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024. ELRA and ICCL. 2024. p. 38-46

Author

Shardlow, Matthew ; Alva-Manchego, Fernando ; Batista-Navarro, Riza Theresa et al. / An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework. Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024. editor / Rodrigo Wilkens ; Rémi Cardon ; Amalia Todirascu ; Núria Gala. ELRA and ICCL, 2024. pp. 38-46

Bibtex

@inproceedings{10099c13a14a482e984410980769d99a,
title = "An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework",
abstract = "We present preliminary findings on the MultiLS dataset, developed in support of the 2024 Multilingual Lexical Simplification Pipeline (MLSP) Shared Task. This dataset currently comprises 300 instances of lexical complexity prediction and lexical simplification across 10 languages. In this paper, we (1) describe the annotation protocol in support of the contribution of future datasets and (2) present summary statistics on the existing data that we have gathered. Multilingual lexical simplification can be used to support low-ability readers to engage with otherwise difficult texts in their native, often low-resourced, languages.",
author = "Matthew Shardlow and Fernando Alva-Manchego and Batista-Navarro, {Riza Theresa} and Stefan Bott and {Calderon Ramirez}, Saul and R{\'e}mi Cardon and Thomas Fran{\c c}ois and Akio Hayakawa and Andrea Horbach and Anna H{\"u}lsing and Yusuke Ide and Imperia, {Joseph Marvin} and Adam Nohej and Kai North and Laura Occhipinti and Rojas, {Nelson Per{\'e}z} and Raihan, {Md Nishat} and Tharindu Ranasinghe and Salazar, {Martin Solis} and Marcos Zampieri and Horacio Saggion",
year = "2024",
month = may,
day = "20",
language = "English",
pages = "38--46",
editor = "Rodrigo Wilkens and R{\'e}mi Cardon and Amalia Todirascu and N{\'u}ria Gala",
booktitle = "Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024",
publisher = "ELRA and ICCL",
note = "3rd Workshop on Tools and Resources for REAding DIfficulties, READI ; Conference date: 20-05-2024 Through 20-05-2024",
url = "https://cental.uclouvain.be/readi2024/",

}

RIS

TY - GEN

T1 - An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework

AU - Shardlow, Matthew

AU - Alva-Manchego, Fernando

AU - Batista-Navarro, Riza Theresa

AU - Bott, Stefan

AU - Calderon Ramirez, Saul

AU - Cardon, Rémi

AU - François, Thomas

AU - Hayakawa, Akio

AU - Horbach, Andrea

AU - Hülsing, Anna

AU - Ide, Yusuke

AU - Imperia, Joseph Marvin

AU - Nohej, Adam

AU - North, Kai

AU - Occhipinti, Laura

AU - Rojas, Nelson Peréz

AU - Raihan, Md Nishat

AU - Ranasinghe, Tharindu

AU - Salazar, Martin Solis

AU - Zampieri, Marcos

AU - Saggion, Horacio

PY - 2024/5/20

Y1 - 2024/5/20

N2 - We present preliminary findings on the MultiLS dataset, developed in support of the 2024 Multilingual Lexical Simplification Pipeline (MLSP) Shared Task. This dataset currently comprises 300 instances of lexical complexity prediction and lexical simplification across 10 languages. In this paper, we (1) describe the annotation protocol in support of the contribution of future datasets and (2) present summary statistics on the existing data that we have gathered. Multilingual lexical simplification can be used to support low-ability readers to engage with otherwise difficult texts in their native, often low-resourced, languages.

AB - We present preliminary findings on the MultiLS dataset, developed in support of the 2024 Multilingual Lexical Simplification Pipeline (MLSP) Shared Task. This dataset currently comprises 300 instances of lexical complexity prediction and lexical simplification across 10 languages. In this paper, we (1) describe the annotation protocol in support of the contribution of future datasets and (2) present summary statistics on the existing data that we have gathered. Multilingual lexical simplification can be used to support low-ability readers to engage with otherwise difficult texts in their native, often low-resourced, languages.

M3 - Conference contribution/Paper

SP - 38

EP - 46

BT - Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024

A2 - Wilkens, Rodrigo

A2 - Cardon, Rémi

A2 - Todirascu, Amalia

A2 - Gala, Núria

PB - ELRA and ICCL

T2 - 3rd Workshop on Tools and Resources for REAding DIfficulties

Y2 - 20 May 2024 through 20 May 2024

ER -