Computing and Communications

Electronic data

icame-2015-0001
Rights statement: © 2015. This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License. (CC BY-NC-ND 3.0)
Final published version, 471 KB, PDF document
Available under license: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License

Text available via DOI:

https://doi.org/10.1515/icame-2015-0001
Final published version
Available under license: None

Guidelines for normalising early modern English corpora: decisions and justifications

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Dawn Archer
Merja Kytö
Alistair Baron
Paul Edward Rayson


Journal publication date: 1/03/2015
Journal: ICAME Journal
Issue number: 1
Volume: 39
Number of pages: 20
Pages (from-to): 5-24
Publication Status: Published
Original language: English

Abstract

Corpora of Early Modern English have been collected and released for research for a number of years. With large-scale digitisation activities gathering pace in the last decade, much more historical textual data is now available for research on numerous topics, including historical linguistics and conceptual history. We summarise previous research which has shown that it is necessary to map historical spelling variants to modern equivalents in order to apply natural language processing and corpus linguistics methods successfully. Manual and semiautomatic methods have been devised to support this normalisation and standardisation process. We argue that it is important to develop a linguistically meaningful rationale to achieve good results from this process. In order to do so, we propose a number of guidelines for normalising corpora and show how these guidelines have been applied in the Corpus of English Dialogues.
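
As a rough illustration of what mapping historical spelling variants to modern equivalents can look like in practice, the following Python sketch applies a small lookup table to a text and records every substitution so that the original forms remain recoverable. It is not the method or tooling described in the article: the variant table, the function names (`normalise`, `match_case`) and the tokenisation rule are all assumptions made for this example.

```python
import re

# Hypothetical variant-to-modern mapping, invented for illustration only.
# A real project would derive such a list from its own guidelines.
VARIANT_MAP = {
    "haue": "have",
    "loue": "love",
    "onely": "only",
    "vertue": "virtue",
}

# Split text into alternating runs of letters and non-letters,
# so that spacing and punctuation are preserved unchanged.
TOKEN_RE = re.compile(r"[A-Za-z]+|[^A-Za-z]+")


def match_case(original, replacement):
    """Carry the capitalisation of the original token over to the replacement."""
    if original.isupper() and len(original) > 1:
        return replacement.upper()
    if original[0].isupper():
        return replacement.capitalize()
    return replacement


def normalise(text, variant_map=VARIANT_MAP):
    """Return the normalised text and a list of (original, modern) pairs."""
    out = []
    changes = []
    for token in TOKEN_RE.findall(text):
        modern = variant_map.get(token.lower())
        if modern is None:
            out.append(token)  # unknown or already-modern form: leave as is
        else:
            replacement = match_case(token, modern)
            out.append(replacement)
            changes.append((token, replacement))
    return "".join(out), changes
```

Calling `normalise("Vertue is onely loue.")` under these assumptions would return the modernised sentence together with the list of replaced forms. A plain lookup of this kind cannot decide cases where a historical spelling coincides with a different modern word, which is one reason an explicit, linguistically motivated rationale matters.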
