Text and speech corpora for natural language processing and corpus linguistics

Research output: Contribution to Journal/Magazine › Special issue › peer-review

Published

Standard

Text and speech corpora for natural language processing and corpus linguistics. / Demner-Fushman, Dina (Editor); Gatherer, Derek (Editor); Wu, Jian (Editor).
In: Scientific Data, Vol. Special Collection, 24.07.2025.

Vancouver

Demner-Fushman D (ed.), Gatherer D (ed.), Wu J (ed.). Text and speech corpora for natural language processing and corpus linguistics. Scientific Data. 2025 Jul 24;Special Collection.

Author

Demner-Fushman, Dina (Editor) ; Gatherer, Derek (Editor) ; Wu, Jian (Editor). / Text and speech corpora for natural language processing and corpus linguistics. In: Scientific Data. 2025 ; Vol. Special Collection.

Bibtex

@article{f585da30b89444aea495f3f7d59d5432,
title = "Text and speech corpora for natural language processing and corpus linguistics",
abstract = "Corpus Linguistics (CL) and Natural Language Processing (NLP) are two of the transformative forces in research across the sciences and humanities, reshaping how insights are gleaned from vast text and speech datasets. Their applications span the natural, medical, social and applied sciences, leading the cutting edge in fields such as healthcare diagnostics, biomedicine, environmental science, and computer vision. This Collection presents a series of annotated text and speech corpora alongside linguistic models tailored for CL and NLP applications. These resources aim to enrich the arsenals of CL and NLP users and facilitate interdisciplinary research.",
keywords = "Natural Language Processing, Corpus Linguistics, corpora, Artificial Intelligence, Machine Learning, Bioinformatics",
author = "Dina Demner-Fushman and Derek Gatherer and Jian Wu",
year = "2025",
month = jul,
day = "24",
language = "English",
volume = "Special Collection",
journal = "Scientific Data",
issn = "2052-4463",
publisher = "Nature Publishing Group",

}

RIS

TY  - JOUR
T1  - Text and speech corpora for natural language processing and corpus linguistics
A2  - Demner-Fushman, Dina
A2  - Gatherer, Derek
A2  - Wu, Jian
PY  - 2025/7/24
Y1  - 2025/7/24
N2  - Corpus Linguistics (CL) and Natural Language Processing (NLP) are two of the transformative forces in research across the sciences and humanities, reshaping how insights are gleaned from vast text and speech datasets. Their applications span the natural, medical, social and applied sciences, leading the cutting edge in fields such as healthcare diagnostics, biomedicine, environmental science, and computer vision. This Collection presents a series of annotated text and speech corpora alongside linguistic models tailored for CL and NLP applications. These resources aim to enrich the arsenals of CL and NLP users and facilitate interdisciplinary research.
AB  - Corpus Linguistics (CL) and Natural Language Processing (NLP) are two of the transformative forces in research across the sciences and humanities, reshaping how insights are gleaned from vast text and speech datasets. Their applications span the natural, medical, social and applied sciences, leading the cutting edge in fields such as healthcare diagnostics, biomedicine, environmental science, and computer vision. This Collection presents a series of annotated text and speech corpora alongside linguistic models tailored for CL and NLP applications. These resources aim to enrich the arsenals of CL and NLP users and facilitate interdisciplinary research.
KW  - Natural Language Processing
KW  - Corpus Linguistics
KW  - corpora
KW  - Artificial Intelligence
KW  - Machine Learning
KW  - Bioinformatics
M3  - Special issue
VL  - Special Collection
JO  - Scientific Data
JF  - Scientific Data
SN  - 2052-4463
ER  - 