Supporting the corpus-based study of Shakespeare’s language: Enhancing a corpus of the First Folio

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Journal publication date: 1/05/2021
Journal: ICAME Journal
Issue number: 1
Number of pages: 50
Pages (from-to): 37-86
Publication status: Published
Original language: English


This article explores challenges in the corpus linguistic analysis of Shakespeare’s language, and Early Modern English more generally, with particular focus on elaborating possible solutions and the benefits they bring. An account of work that took place within the Encyclopedia of Shakespeare’s Language Project (2016–2019) is given, which discusses the development of the project’s data resources, specifically, the Enhanced Shakespearean Corpus. Topics covered include the composition of the corpus and its subcomponents; the structure of the XML markup; the design of the extensive character metadata; and the word-level corpus annotation, including spelling regularisation, part-of-speech tagging, lemmatisation and semantic tagging. The challenges that arise from each of these undertakings are not exclusive to a corpus-based treatment of Shakespeare’s plays but it is in the context of Shakespeare’s language that they are so severe as to seem almost insurmountable. The solutions developed for the Enhanced Shakespearean Corpus – often combining automated manipulation with manual interventions, and always principled – offer a way through.