Accepted author manuscript
Licence: CC BY: Creative Commons Attribution 4.0 International License
Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Chapter (peer-reviewed) › peer-review
TY - CHAP
T1 - Multilingual Financial Narrative Processing
T2 - Analysing Annual Reports in English, Spanish and Portuguese
AU - El Haj, Mahmoud
AU - Rayson, Paul Edward
AU - Young, Steven Eric
AU - Alves, Paulo
AU - Herrero Zorita, Carlos
PY - 2019/2
Y1 - 2019/2
N2 - This chapter describes and evaluates the use of Information Extraction and Natural Language Processing methods for the extraction and analysis of financial annual reports in three languages: English, Spanish and Portuguese. The work retains information on document structure, which is needed to distinguish clearly between the narrative and financial statement components of annual reports and between individual sections within the narrative component. Extraction accuracy varies between languages, with English exceeding 95%. We apply the extraction methods to a comprehensive sample of annual reports published by UK, Spanish and Portuguese non-financial firms between 2003 and 2014.
AB - This chapter describes and evaluates the use of Information Extraction and Natural Language Processing methods for the extraction and analysis of financial annual reports in three languages: English, Spanish and Portuguese. The work retains information on document structure, which is needed to distinguish clearly between the narrative and financial statement components of annual reports and between individual sections within the narrative component. Extraction accuracy varies between languages, with English exceeding 95%. We apply the extraction methods to a comprehensive sample of annual reports published by UK, Spanish and Portuguese non-financial firms between 2003 and 2014.
U2 - 10.1142/11116
DO - 10.1142/11116
M3 - Chapter (peer-reviewed)
SN - 9789813274877
BT - Multilingual Text Analysis
A2 - Litvak, Marina
A2 - Vanetik, Natalia
PB - World Scientific Publishing
ER -