Knowledge Transfer Event #1 on Text Processing for Annual Reports
Activity: Talk or presentation types › Business Course/Training
First of two planned Knowledge Transfer Events with the FRC as part of project FRC2023-0131, "Analyzing trends in annual report language and content". Participants from the FRC included the Head of Innovation and Digital - Regulatory Standards, the Project Manager of the Financial Reporting Lab, and the Data Analyst Team Leader. The objective of the session was to give FRC colleagues an overview of the annual report dataset and text processing resources that our work has produced. The FRC's aim is to integrate our data and processing resources into its analysis and decision making. The session covered the following issues:

1. Overview of the process for extracting text and document structure (table of contents vs. PDF bookmarks)
2. Overview of the method for distinguishing between annual report content and financial statements
3. Overview of the method for classifying (tagging) sections in the annual report into generic categories (e.g., chair’s letter, governance statement, etc.)
4. Summary of tagged sections (corpora) available for analysis
5. Overview of the text processing pipeline (including tokenization, NER tagging, stop word removal, wordlist counts, etc.)
6. Overview of the annual report database (2006-2022) and the matching process used to ensure time-series consistency at the firm level
7. Summary of available text resources
8. Key challenges faced in the research
9. Opportunities for the FRC from leveraging the data and methods
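To make the text processing steps named in item 5 concrete, the following is a minimal sketch of tokenization, stop word removal, and wordlist counting using only the Python standard library. It is not the project's actual pipeline: the function names, the stop word list, the `RISK_WORDLIST` set, and the sample sentence are all hypothetical, and NER tagging is left out because it would require an external NLP library.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny lists for illustration only;
# a real pipeline would use much larger curated resources.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "our", "we", "is", "are"}
RISK_WORDLIST = {"risk", "uncertainty", "uncertain", "volatility"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in stop_words]


def wordlist_count(tokens, wordlist):
    """Count how often each wordlist term occurs among the tokens."""
    return Counter(t for t in tokens if t in wordlist)


# Hypothetical sentence standing in for a tagged annual report section.
sample = (
    "The Board reviewed the principal risk and the uncertainty "
    "in our markets. Risk remains."
)

tokens = remove_stop_words(tokenize(sample))
counts = wordlist_count(tokens, RISK_WORDLIST)
print(counts)
```

Applied section by section, counts like these could be stored alongside firm and year identifiers so that wordlist frequencies can be compared over time, which is the kind of use the session summary points to.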
| Name | Financial Reporting Council |
|---|---|
| Country/Territory | United Kingdom |