A Comparative Study of Evaluation Metrics for Long-Document Financial Narrative Summarization with Transformers

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published
Publication date: 21/06/2023
Host publication: Natural Language Processing and Information Systems - 28th International Conference on Applications of Natural Language to Information Systems, NLDB 2023, Proceedings
Editors: Elisabeth Métais, Farid Meziane, Warren Manning, Stephan Reiff-Marganiec, Vijayan Sugumaran
Place of Publication: Cham
Publisher: Springer
Pages: 391-403
Number of pages: 13
ISBN (print): 9783031353192
Original language: English
Event: 28th International Conference on Applications of Natural Language to Information Systems, NLDB 2023 - Derby, United Kingdom
Duration: 21/06/2023 - 23/06/2023

Conference

Conference: 28th International Conference on Applications of Natural Language to Information Systems, NLDB 2023
Country/Territory: United Kingdom
City: Derby
Period: 21/06/23 - 23/06/23

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13913 LNCS
ISSN (Print): 0302-9743
ISSN (electronic): 1611-3349

Abstract

More than 2,000 companies are listed on the UK's London Stock Exchange, divided into 11 sectors, and they are required to communicate their financial results at least twice per financial year. UK annual reports are lengthy documents, averaging around 80 pages. In this study, we benchmark a variety of summarisation methods built on different pre-trained transformers combined with different extraction techniques. In addition, we consider multiple evaluation metrics in order to investigate their differing behaviour and applicability on a dataset from the Financial Narrative Summarisation (FNS 2020) shared task, which is composed of annual reports published by firms listed on the London Stock Exchange together with their corresponding summaries. We hypothesise that some evaluation metrics do not reflect true summarisation ability and propose a novel metric, BRUGEscore, defined as the harmonic mean of ROUGE-2 and BERTScore. Finally, we perform a statistical significance test on our results to verify that they are statistically robust, alongside an adversarial analysis task with three different corruption methods.
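
The abstract defines BRUGEscore as the harmonic mean of ROUGE-2 and BERTScore. A minimal sketch of that definition, assuming both components are the metrics' F1 values on a [0, 1] scale (the paper's exact formulation may differ):

\[
\mathrm{BRUGE} = \frac{2 \cdot \mathrm{ROUGE\text{-}2} \cdot \mathrm{BERTScore}}{\mathrm{ROUGE\text{-}2} + \mathrm{BERTScore}}
\]

As with F1, the harmonic mean is dominated by the lower of its two inputs, so under this reading a summary must score well on both lexical overlap (ROUGE-2) and semantic similarity (BERTScore) to obtain a high BRUGE score.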