The Financial Document Structure Extraction Shared Task (FinTOC 2022)

Research output: Contribution to conference - Without ISBN/ISSN, Conference paper, peer-review

Published

Standard

The Financial Document Structure Extraction Shared Task (FinTOC 2022). / El-Haj, Mahmoud; Kang, Juyeon; Azzi, Abderrahim Ait et al.
2022. 92-97. Paper presented at The 4th Financial Narrative Processing Workshop, Marseille, France.

Harvard

El-Haj, M, Kang, J, Azzi, AA, Bellato, S, El Maarouf, I, Gan, M, Gisbert, A & Sandoval, A 2022, 'The Financial Document Structure Extraction Shared Task (FinTOC 2022)', Paper presented at The 4th Financial Narrative Processing Workshop, Marseille, France, 24/06/22 - 24/06/22, pp. 92-97. <http://www.lrec-conf.org/proceedings/lrec2022/workshops/FNP/pdf/2022.fnp-1.13.pdf>

APA

El-Haj, M., Kang, J., Azzi, A. A., Bellato, S., El Maarouf, I., Gan, M., Gisbert, A., & Sandoval, A. (2022). The Financial Document Structure Extraction Shared Task (FinTOC 2022). 92-97. Paper presented at The 4th Financial Narrative Processing Workshop, Marseille, France. http://www.lrec-conf.org/proceedings/lrec2022/workshops/FNP/pdf/2022.fnp-1.13.pdf

Vancouver

El-Haj M, Kang J, Azzi AA, Bellato S, El Maarouf I, Gan M et al. The Financial Document Structure Extraction Shared Task (FinTOC 2022). 2022. Paper presented at The 4th Financial Narrative Processing Workshop, Marseille, France.

Author

El-Haj, Mahmoud ; Kang, Juyeon ; Azzi, Abderrahim Ait et al. / The Financial Document Structure Extraction Shared Task (FinTOC 2022). Paper presented at The 4th Financial Narrative Processing Workshop, Marseille, France. 5 p.

Bibtex

@conference{80698af151a549cfa16a94f032fe3a7f,
title = "The Financial Document Structure Extraction Shared Task (FinTOC 2022)",
abstract = "This paper describes the FinTOC-2022 Shared Task on structure extraction from financial documents, its participants' results and their findings. This shared task was organized as part of The 4th Financial Narrative Processing Workshop (FNP 2022), held jointly at The 13th Edition of the Language Resources and Evaluation Conference (LREC 2022), Marseille, France (El-Haj et al., 2022). This shared task aimed to stimulate research in systems for extracting table-of-contents (TOC) from investment documents (such as financial prospectuses) by detecting the document titles and organizing them hierarchically into a TOC. For the fourth edition of this shared task, three subtasks were presented to the participants: one with English documents, one with French documents and the other one with Spanish documents. This year, we proposed a different and revised dataset for English and French compared to the previous editions of FinTOC and a new dataset for Spanish documents was added. The task attracted 6 submissions for each language from 4 teams, and the most successful methods make use of textual, structural and visual features extracted from the documents and propose classification models for detecting titles and TOCs for all of the subtasks.",
author = "Mahmoud El-Haj and Juyeon Kang and Azzi, {Abderrahim Ait} and Sandra Bellato and {El Maarouf}, Ismail and Mei Gan and Ana Gisbert and Antonio Sandoval",
year = "2022",
month = jun,
day = "15",
language = "English",
pages = "92--97",
note = "The 4th Financial Narrative Processing Workshop, FNP 2022 ; Conference date: 24-06-2022 Through 24-06-2022",
url = "http://wp.lancs.ac.uk/cfie/fnp2022/",

}

RIS

TY - CONF

T1 - The Financial Document Structure Extraction Shared Task (FinTOC 2022)

AU - El-Haj, Mahmoud

AU - Kang, Juyeon

AU - Azzi, Abderrahim Ait

AU - Bellato, Sandra

AU - El Maarouf, Ismail

AU - Gan, Mei

AU - Gisbert, Ana

AU - Sandoval, Antonio

N1 - Conference code: 4

PY - 2022/6/15

Y1 - 2022/6/15

N2 - This paper describes the FinTOC-2022 Shared Task on structure extraction from financial documents, its participants' results and their findings. This shared task was organized as part of The 4th Financial Narrative Processing Workshop (FNP 2022), held jointly at The 13th Edition of the Language Resources and Evaluation Conference (LREC 2022), Marseille, France (El-Haj et al., 2022). This shared task aimed to stimulate research in systems for extracting table-of-contents (TOC) from investment documents (such as financial prospectuses) by detecting the document titles and organizing them hierarchically into a TOC. For the fourth edition of this shared task, three subtasks were presented to the participants: one with English documents, one with French documents and the other one with Spanish documents. This year, we proposed a different and revised dataset for English and French compared to the previous editions of FinTOC and a new dataset for Spanish documents was added. The task attracted 6 submissions for each language from 4 teams, and the most successful methods make use of textual, structural and visual features extracted from the documents and propose classification models for detecting titles and TOCs for all of the subtasks.

AB - This paper describes the FinTOC-2022 Shared Task on structure extraction from financial documents, its participants' results and their findings. This shared task was organized as part of The 4th Financial Narrative Processing Workshop (FNP 2022), held jointly at The 13th Edition of the Language Resources and Evaluation Conference (LREC 2022), Marseille, France (El-Haj et al., 2022). This shared task aimed to stimulate research in systems for extracting table-of-contents (TOC) from investment documents (such as financial prospectuses) by detecting the document titles and organizing them hierarchically into a TOC. For the fourth edition of this shared task, three subtasks were presented to the participants: one with English documents, one with French documents and the other one with Spanish documents. This year, we proposed a different and revised dataset for English and French compared to the previous editions of FinTOC and a new dataset for Spanish documents was added. The task attracted 6 submissions for each language from 4 teams, and the most successful methods make use of textual, structural and visual features extracted from the documents and propose classification models for detecting titles and TOCs for all of the subtasks.

UR - https://aclanthology.org/2022.fnp-1.13

M3 - Conference paper

SP - 92

EP - 97

T2 - The 4th Financial Narrative Processing Workshop

Y2 - 24 June 2022 through 24 June 2022

ER -