Towards a Multilingual Financial Narrative Processing System

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN; Conference contribution/Paper; peer-review

Published
Publication date: 7/05/2018
Host publication: The First Financial Narrative Processing Workshop: Proceedings of the 11th Edition of the Language Resources and Evaluation Conference - Miyazaki, Japan
Editors: Mahmoud El-Haj, Paul Rayson, Andrew Moore
Pages: 52-58
Number of pages: 7
Original language: English
Event: The 1st Financial Narrative Processing Workshop in LREC 2018 - Miyazaki, Japan
Duration: 7/05/2018 → …
http://wp.lancs.ac.uk/cfie/

Workshop

Workshop: The 1st Financial Narrative Processing Workshop in LREC 2018
Abbreviated title: FNP 2018
Country/Territory: Japan
City: Miyazaki
Period: 7/05/18 → …
Internet address

Abstract

Large-scale financial narrative processing for UK annual reports has only become possible in the last few years with our prior work on automatically understanding and extracting the structure of unstructured PDF glossy reports. This has levelled the playing field somewhat relative to US research, where annual reports (10-K Forms) have a rigid structure imposed on them by legislation and are submitted in plain text format. The structure extraction is just the first step in a pipeline of analyses to examine disclosure quality and change over time relative to financial results. In this paper, we describe and evaluate the use of similar Information Extraction and Natural Language Processing methods for the extraction and analysis of annual financial reports in a second language (Portuguese), in order to evaluate the applicability of our techniques in another national context (Portugal). Extraction accuracy varies between languages, with English exceeding 95%. To further examine the robustness of our techniques, we apply the extraction methods to a comprehensive sample of annual reports published by UK and Portuguese non-financial firms between 2003 and 2015.