Knowledge Transfer Event #1 on Text Processing for Annual Reports

Activity: Talk or presentation types: Business Course/Training

26/03/2024

First of two planned Knowledge Transfer Events with the FRC as part of project FRC2023-0131, "Analyzing trends in annual report language and content". Participants from the FRC included the Head of Innovation and Digital - Regulatory Standards, the Project Manager of the Financial Reporting Lab, and the Data Analyst Team Leader. The objective of the session was to give FRC colleagues an overview of the annual report dataset and text processing resources that our work has produced. The FRC's aim is to integrate our data and processing resources into their analysis and decision making. The session covered the following topics:

1. Overview of the process for extracting text and document structure (table of contents vs. PDF bookmarks)
2. Overview of the method for distinguishing between annual report content and financial statements
3. Overview of the method for classifying (tagging) sections in the annual report into generic categories (e.g., chair’s letter, governance statement, etc.)
4. Summary of tagged sections (corpora) available for analysis
5. Overview of the text processing pipeline (including tokenization, NER tagging, stop word removal, wordlist counts, etc.)
6. Overview of the annual report database (2006-2022) and the matching process used to ensure time-series consistency at the firm level
7. Summary of available text resources
8. Key challenges faced in the research
9. Opportunities for the FRC from leveraging the data and methods
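
For readers unfamiliar with the pipeline steps named in item 5, the following is a minimal illustrative sketch of tokenization, stop word removal, and wordlist counting using only the Python standard library. It is not the project's actual pipeline: the stop word list, the `RISK_WORDLIST` wordlist, and all function names are hypothetical, and NER tagging is omitted because it would require a third-party library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny lists for illustration only;
# the project's real stop word list and wordlists are not given in this record.
STOP_WORDS = frozenset({"the", "a", "an", "and", "of", "to", "in", "is", "our", "we"})
RISK_WORDLIST = frozenset({"risk", "uncertainty", "exposure"})

# A token is a run of lowercase letters, optionally with one internal apostrophe.
TOKEN_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return TOKEN_RE.findall(text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in stop_words]


def wordlist_counts(tokens, wordlist):
    """Count occurrences of each wordlist term, including zero counts."""
    counts = Counter(t for t in tokens if t in wordlist)
    return {word: counts[word] for word in sorted(wordlist)}


def process_section(text, wordlist=RISK_WORDLIST):
    """Run the three steps on one tagged section of an annual report."""
    tokens = remove_stop_words(tokenize(text))
    return {
        "n_tokens": len(tokens),
        "wordlist_counts": wordlist_counts(tokens, wordlist),
    }


if __name__ == "__main__":
    sample = "The principal risk is our exposure to currency risk."
    print(process_section(sample))
```

In a sketch like this, each tagged section (for example, a chair's letter) would be passed to `process_section` separately, so that counts can be compared across sections, firms, and years.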

External organisation (External collaborations)

Name: Financial Reporting Council
Country/Territory: United Kingdom