Writing the Vernacular: Transcribing and Tagging the Newcastle Electronic Corpus of Tyneside English

Research output: Contribution to Journal/Magazine › Journal article › peer-review

Published

Standard

Writing the Vernacular: Transcribing and Tagging the Newcastle Electronic Corpus of Tyneside English. / Beal, J.; Corrigan, K.; Smith, N. et al.
In: Studies in Variation, Contacts and Change in English, Vol. 1, 2007.

Vancouver

Beal J, Corrigan K, Smith N, Rayson P. Writing the Vernacular: Transcribing and Tagging the Newcastle Electronic Corpus of Tyneside English. Studies in Variation, Contacts and Change in English. 2007;1.

Author

Beal, J. ; Corrigan, K. ; Smith, N. et al. / Writing the Vernacular: Transcribing and Tagging the Newcastle Electronic Corpus of Tyneside English. In: Studies in Variation, Contacts and Change in English. 2007 ; Vol. 1.

Bibtex

@article{e5d8a981aab74581a76bf0a81672bf6f,
title = "Writing the Vernacular: Transcribing and Tagging the Newcastle Electronic Corpus of Tyneside English",
abstract = "The Newcastle Electronic Corpus of Tyneside English (NECTE) presented a number of problems not encountered by those producing corpora of standard varieties. The primary material consisted of audio recordings which needed to be orthographically transcribed and grammatically tagged. Preston (1985), (2000), Macaulay (1991), Kirk (1997), Cameron (2001) and Beal (2005) all note that representing vernacular Englishes orthographically, e.g. by using {"}eye dialect{"}, can be problematic on various levels. Apart from unwelcome associations with negative political, racial or social connotations, there are theoretical objections to devising non-standard spellings which represent certain groups of vernacular speakers, thus making their speech appear more differentiated from mainstream colloquial varieties than is warranted. In the first half of this paper, we outline the principles and methods adopted in devising an Orthographic Transcription Protocol (OTP) for such a vernacular corpus, and the challenges faced by the NECTE team in practice. Protocols for grammatical tagging have likewise been devised with standard varieties in mind. In the second half, we relate how existing part-of-speech (POS)-tagging software (CLAWS4, cf. Garside \& Smith 1997; and Template Tagger, cf. Fligelstone et al. 1997) had to be adapted to take account of the non-standard grammar of Tyneside English.",
author = "J. Beal and K. Corrigan and N. Smith and P. Rayson",
year = "2007",
language = "English",
volume = "1",
journal = "Studies in Variation, Contacts and Change in English",
issn = "1797-4453",
}

RIS

TY - JOUR
T1 - Writing the Vernacular: Transcribing and Tagging the Newcastle Electronic Corpus of Tyneside English
AU - Beal, J.
AU - Corrigan, K.
AU - Smith, N.
AU - Rayson, P.
PY - 2007
Y1 - 2007
N2 - The Newcastle Electronic Corpus of Tyneside English (NECTE) presented a number of problems not encountered by those producing corpora of standard varieties. The primary material consisted of audio recordings which needed to be orthographically transcribed and grammatically tagged. Preston (1985), (2000), Macaulay (1991), Kirk (1997), Cameron (2001) and Beal (2005) all note that representing vernacular Englishes orthographically, e.g. by using "eye dialect", can be problematic on various levels. Apart from unwelcome associations with negative political, racial or social connotations, there are theoretical objections to devising non-standard spellings which represent certain groups of vernacular speakers, thus making their speech appear more differentiated from mainstream colloquial varieties than is warranted. In the first half of this paper, we outline the principles and methods adopted in devising an Orthographic Transcription Protocol (OTP) for such a vernacular corpus, and the challenges faced by the NECTE team in practice. Protocols for grammatical tagging have likewise been devised with standard varieties in mind. In the second half, we relate how existing part-of-speech (POS)-tagging software (CLAWS4, cf. Garside & Smith 1997; and Template Tagger, cf. Fligelstone et al. 1997) had to be adapted to take account of the non-standard grammar of Tyneside English.

M3 - Journal article
VL - 1
JO - Studies in Variation, Contacts and Change in English
JF - Studies in Variation, Contacts and Change in English
SN - 1797-4453
ER -