Final published version, 218 KB, PDF document
Available under license: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
}
TY - GEN
T1 - The BEA 2024 Shared Task on the Multilingual Lexical Simplification Pipeline
AU - Shardlow, Matthew
AU - Alva-Manchego, Fernando
AU - Batista-Navarro, Riza Theresa
AU - Bott, Stefan
AU - Calderon Ramirez, Saul
AU - Cardon, Rémi
AU - François, Thomas
AU - Hayakawa, Akio
AU - Horbach, Andrea
AU - Hülsing, Anna
AU - Imperial, Joseph Marvin
AU - Nohejl, Adam
AU - Ide, Yusuke
AU - North, Kai
AU - Occhipinti, Laura
AU - Rojas, Nelson Peréz
AU - Raihan, Md Nishat
AU - Ranasinghe, Tharindu
AU - Salazar, Martin Solis
AU - Štajner, Sanja
AU - Zampieri, Marcos
AU - Saggion, Horacio
PY - 2024/6/20
Y1 - 2024/6/20
AB - We report the findings of the 2024 Multilingual Lexical Simplification Pipeline shared task. We released a new dataset comprising 5,927 instances of lexical complexity prediction and lexical simplification on common contexts across 10 languages, split into trial (300) and test (5,627). Ten teams participated across 2 tracks and 10 languages, with 233 runs evaluated across all systems. Five teams participated in all languages for the lexical complexity prediction task and four teams participated in all languages for the lexical simplification task. Teams employed a range of strategies, making use of open- and closed-source large language models for lexical simplification, as well as feature-based approaches for lexical complexity prediction. The highest scoring team on the combined multilingual data obtained a Pearson’s correlation of 0.6241 and an ACC@1@Top1 of 0.3772, demonstrating that there is still room for improvement on two difficult sub-tasks of the lexical simplification pipeline.
M3 - Conference contribution/Paper
SP - 571
EP - 589
BT - Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
PB - Association for Computational Linguistics
CY - Kerrville
T2 - The 19th Workshop on Innovative Use of NLP for Building Educational Applications
Y2 - 20 June 2024 through 20 June 2024
ER -