
Text and speech corpora for natural language processing and corpus linguistics

Research output: Contribution to Journal/Magazine › Special issue › peer-review

Published
Journal publication date: 24/07/2025
Journal: Scientific Data
Volume: Special Collection
Publication Status: Published
Original language: English

Abstract

Corpus Linguistics (CL) and Natural Language Processing (NLP) are two of the transformative forces in research across the sciences and humanities, reshaping how insights are gleaned from vast text and speech datasets. Their applications span the natural, medical, social and applied sciences, leading the cutting edge in fields such as healthcare diagnostics, biomedicine, environmental science, and computer vision. This Collection presents a series of annotated text and speech corpora alongside linguistic models tailored for CL and NLP applications. These resources aim to enrich the arsenals of CL and NLP users and facilitate interdisciplinary research.