CoFiF Plus: A French Financial Narrative Summarisation Corpus

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN, Conference contribution/Paper, peer-review

Published

Standard

CoFiF Plus: A French Financial Narrative Summarisation Corpus. / Zmandar, Nadhem; Daudert, Tobias; Ahmadi, Sina et al.
Language Resources and Evaluation (LREC 2022). ed. / Nicoletta Calzolari. Paris: European Language Resources Association (ELRA), 2022. p. 1622-1639.

Harvard

Zmandar, N, Daudert, T, Ahmadi, S, El-Haj, M & Rayson, P 2022, CoFiF Plus: A French Financial Narrative Summarisation Corpus. in N Calzolari (ed.), Language Resources and Evaluation (LREC 2022). European Language Resources Association (ELRA), Paris, pp. 1622-1639, 13th Language Resources and Evaluation Conference, Marseille, France, 20/06/22. <http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.174.pdf>

APA

Zmandar, N., Daudert, T., Ahmadi, S., El-Haj, M., & Rayson, P. (2022). CoFiF Plus: A French Financial Narrative Summarisation Corpus. In N. Calzolari (Ed.), Language Resources and Evaluation (LREC 2022) (pp. 1622-1639). European Language Resources Association (ELRA). http://www.lrec-conf.org/proceedings/lrec2022/pdf/2022.lrec-1.174.pdf

Vancouver

Zmandar N, Daudert T, Ahmadi S, El-Haj M, Rayson P. CoFiF Plus: A French Financial Narrative Summarisation Corpus. In Calzolari N, editor, Language Resources and Evaluation (LREC 2022). Paris: European Language Resources Association (ELRA). 2022. p. 1622-1639

Author

Zmandar, Nadhem ; Daudert, Tobias ; Ahmadi, Sina et al. / CoFiF Plus : A French Financial Narrative Summarisation Corpus. Language Resources and Evaluation (LREC 2022). editor / Nicoletta Calzolari. Paris : European Language Resources Association (ELRA), 2022. pp. 1622-1639

Bibtex

@inproceedings{f892c1960d4b4751a1c3ca13858c9abd,
title = "CoFiF Plus: A French Financial Narrative Summarisation Corpus",
abstract = "Natural Language Processing is increasingly being applied in the finance and business industry to analyse the text of many different types of financial documents. Given the increasing growth of firms around the world, the volume of financial disclosures and financial texts in different languages and forms is increasing sharply and therefore the study of language technology methods that automatically summarise content has grown rapidly into a major research area. Corpora for financial narrative summarisation exist in English, but there is a significant lack of financial text resources in the French language. To remedy this, we present CoFiF Plus, the first financial narrative summarisation dataset providing a comprehensive set of financial text written in the French language. The dataset has been extracted from French financial reports published in PDF file format. It is composed of 1,703 reports from the most capitalised companies in France (Euronext Paris) covering a time frame from 1995 to 2021. This paper describes the collection, annotation and validation of the financial reports and their summaries. It also describes the dataset and gives the results of some baseline summarisers.",
author = "Nadhem Zmandar and Tobias Daudert and Sina Ahmadi and Mahmoud El-Haj and Paul Rayson",
year = "2022",
month = jun,
day = "23",
language = "English",
pages = "1622--1639",
editor = "Nicoletta Calzolari",
booktitle = "Language Resources and Evaluation (LREC 2022)",
publisher = "European Language Resources Association (ELRA)",
note = "13th Language Resources and Evaluation Conference, LREC 2022 ; Conference date: 20-06-2022 Through 25-06-2022",
url = "https://lrec2022.lrec-conf.org/en/",

}

RIS

TY - GEN

T1 - CoFiF Plus

T2 - 13th Language Resources and Evaluation Conference

AU - Zmandar, Nadhem

AU - Daudert, Tobias

AU - Ahmadi, Sina

AU - El-Haj, Mahmoud

AU - Rayson, Paul

PY - 2022/6/23

Y1 - 2022/6/23

N2 - Natural Language Processing is increasingly being applied in the finance and business industry to analyse the text of many different types of financial documents. Given the increasing growth of firms around the world, the volume of financial disclosures and financial texts in different languages and forms is increasing sharply and therefore the study of language technology methods that automatically summarise content has grown rapidly into a major research area. Corpora for financial narrative summarisation exist in English, but there is a significant lack of financial text resources in the French language. To remedy this, we present CoFiF Plus, the first financial narrative summarisation dataset providing a comprehensive set of financial text written in the French language. The dataset has been extracted from French financial reports published in PDF file format. It is composed of 1,703 reports from the most capitalised companies in France (Euronext Paris) covering a time frame from 1995 to 2021. This paper describes the collection, annotation and validation of the financial reports and their summaries. It also describes the dataset and gives the results of some baseline summarisers.

M3 - Conference contribution/Paper

SP - 1622

EP - 1639

BT - Language Resources and Evaluation (LREC 2022)

A2 - Calzolari, Nicoletta

PB - European Language Resources Association (ELRA)

CY - Paris

Y2 - 20 June 2022 through 25 June 2022

ER -