
Electronic data

  • 2024.bea-1.51

    Final published version, 218 KB, PDF document

    Available under license: CC BY: Creative Commons Attribution 4.0 International License


The BEA 2024 Shared Task on the Multilingual Lexical Simplification Pipeline

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published
  • Matthew Shardlow
  • Fernando Alva-Manchego
  • Riza Theresa Batista-Navarro
  • Stefan Bott
  • Saul Calderon Ramirez
  • Rémi Cardon
  • Thomas François
  • Akio Hayakawa
  • Andrea Horbach
  • Anna Hülsing
  • Joseph Marvin Imperial
  • Adam Nohejl
  • Yusuke Ide
  • Kai North
  • Laura Occhipinti
  • Nelson Peréz Rojas
  • Md Nishat Raihan
  • Martin Solis Salazar
  • Sanja Štajner
  • Marcos Zampieri
  • Horacio Saggion
Publication date: 20/06/2024
Host publication: Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2024)
Place of publication: Kerrville
Publisher: Association for Computational Linguistics
Pages: 571-589
Number of pages: 19
ISBN (electronic): 9798891761001
Original language: English
Event: The 19th Workshop on Innovative Use of NLP for Building Educational Applications
Duration: 20/06/2024 - 20/06/2024

Workshop

Workshop: The 19th Workshop on Innovative Use of NLP for Building Educational Applications
Abbreviated title: BEA 2024
Period: 20/06/24 - 20/06/24

Abstract

We report the findings of the 2024 Multilingual Lexical Simplification Pipeline shared task. We released a new dataset comprising 5,927 instances of lexical complexity prediction and lexical simplification on common contexts across 10 languages, split into trial (300) and test (5,627). Ten teams participated across two tracks and 10 languages, with 233 runs evaluated across all systems. Five teams participated in all languages for the lexical complexity prediction task and four teams participated in all languages for the lexical simplification task. Teams employed a range of strategies, making use of open- and closed-source large language models for lexical simplification, as well as feature-based approaches for lexical complexity prediction. The highest-scoring team on the combined multilingual data obtained a Pearson's correlation of 0.6241 and an ACC@1@Top1 of 0.3772, both demonstrating that there is still room for improvement on two difficult sub-tasks of the lexical simplification pipeline.
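
For readers who want to relate the two headline numbers to concrete computations, the sketch below is a minimal, unofficial illustration (it is not the shared task scorer): Pearson's correlation compares predicted and gold lexical complexity scores, and ACC@1@Top1 is read here as the fraction of instances whose single top-ranked substitution appears among the most frequently suggested gold substitutions. The function names, data layout, and exact matching rule are assumptions for illustration only.

    from scipy.stats import pearsonr

    def lcp_pearson(gold_scores, pred_scores):
        # Pearson's r between gold and predicted lexical complexity scores
        # (the correlation figure reported for the prediction track).
        return pearsonr(gold_scores, pred_scores)[0]

    def acc_at_1_at_top1(pred_top1, gold_top1_sets):
        # Fraction of instances whose top-ranked candidate appears among the
        # most frequently suggested gold substitutions for that instance
        # (an assumed reading of ACC@1@Top1; the official scorer may differ).
        hits = sum(1 for pred, gold in zip(pred_top1, gold_top1_sets) if pred in gold)
        return hits / len(pred_top1)

    # Hypothetical toy usage:
    # lcp_pearson([0.2, 0.5, 0.8], [0.25, 0.4, 0.9])
    # acc_at_1_at_top1(["easy", "big"], [{"easy", "simple"}, {"large"}])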