Tracing verbal aggression over time, using the Historical Thesaurus of English

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper

Published

Standard

Tracing verbal aggression over time, using the Historical Thesaurus of English. / Malory, Beth.
Corpus Linguistics 2015. 2015. p. 27-27.

Harvard

Malory, B 2015, Tracing verbal aggression over time, using the Historical Thesaurus of English. in Corpus Linguistics 2015. pp. 27-27, University Centre for Computer Corpus Research on Language. August 2014, Lancaster, United Kingdom, 21/07/15. <https://ucrel.lancs.ac.uk/cl2015/doc/CL2015-AbstractBook.pdf>

Bibtex

@inproceedings{4f779e902fec46788679bdc13a83db71,
title = "Tracing verbal aggression over time, using the Historical Thesaurus of English",
abstract = "The work reported here seeks to demonstrate that automatic content analysis tools can be used effectively to trace pragmatic phenomena – including aggression – over time. In doing so, it builds upon preliminary work conducted by Archer (2014), using Wmatrix (Rayson 2008), in which Archer used six semtags – Q2.2 (speech acts), A5.1+/- ({\textquoteleft}good/bad{\textquoteright} evaluation), A5.2+/- ({\textquoteleft}true/false{\textquoteright} evaluation), E3- ({\textquoteleft}angry/violent{\textquoteright}), S1.2.4+/- ({\textquoteleft}im/politeness{\textquoteright}), and S7.2+/- ({\textquoteleft}respect/lack of respect{\textquoteright}) – to examine aggression in 200 Old Bailey trial texts covering the decade 1783-93. Having annotated the aforementioned Old Bailey dataset using Wmatrix, Archer (2014) targeted the utterances captured by the semtags listed above. This afforded her a useful “way in” to (by providing multiple potential indicators of) verbal aggression in the late eighteenth-century English courtroom. Using the {\textquoteleft}expand context{\textquoteright} facility within Wmatrix, and consulting the original trial transcripts, those incidences identified as verbally aggressive were then re-contextualised – thereby allowing Archer to disregard any that did not point to aggression in the final instance. The success of this approach allowed her to conclude that automatic content analysis tools like USAS can indeed be used to trace pragmatic phenomena (and in historical as well as modern texts). This approach was not without its teething problems, however. First, apart from those semtags which were used in conjunction with others, as portmanteau tags (e.g. Q2.2 with E3- to capture aggressive speech acts), the approach necessitated the targeting of individual semtags within a given text. The need to perform a time-intensive manual examination of the wider textual context thus made the use of large datasets prohibitive.
Furthermore, there was a closely related problem concerning the tagset{\textquoteright}s basis in The Longman Lexicon of Contemporary English (McArthur, 1981), and its consequent inability to take account of diachronic meaning change. This tended to result in the occasional mis-assignment of words which have been subject to significant semantic change over time, including politely, insult and insulted. In one instance, for example, politely was used to describe the deftness with which a thief picked his victim{\textquoteright}s pocket! The need for manual checks to prevent such mis-assignments from affecting results further necessitated the narrowness of scope to which Archer (2014) was subject. In the extension to this work, reported here, the authors present their solutions to these problems. These solutions have at their core an innovation which allows historical datasets to be tagged semantically, using themes derived from the Historical Thesaurus of the Oxford English Dictionary (henceforth HTOED). These themes have been identified as part of an AHRC/ESRC funded project entitled “Semantic Annotation and Mark Up for Enhancing Lexical Searches”, henceforth SAMUELS (grant reference AH/L010062/1). The SAMUELS project has also enabled researchers from the Universities of Glasgow, Lancaster, Huddersfield, Strathclyde and Central Lancashire to work together to develop a semantic annotation tool which, thanks to its advanced disambiguation facility, enables the automatic annotation of words, as well as multi-word units, in historical texts with their precise meanings.
This means that pragmatic phenomena such as aggression can be more profitably sought automatically following the initial identification of what the authors have termed a {\textquoteleft}meaning chain{\textquoteright}, that is, a series of HTOED-derived {\textquoteleft}themes{\textquoteright} analogous to DNA strings. This paper reports, first, on the authors{\textquoteright} identification of 68 potentially pertinent HTOED {\textquoteleft}themes{\textquoteright} and, second, on their investigation of the possible permutations of these themes, and the process by which they assessed which themes in which combinations best identified and captured aggression in their four datasets. The datasets used for this research are drawn from Hansard and from Historic Hansard; and are taken from periods judged to be characterized, in some way, by political/national unrest or disquiet. The datasets represent the periods 1812-14 (i.e., “The War of 1812” between Great Britain and America), 1879-81 (a period of complex wrangling between two English governments and their opposition, led by fierce rivals Disraeli and Gladstone), 1913-19 (the First World War, including its immediate build-up and aftermath), and 1978-9 (“The Winter of Discontent”).",
author = "Beth Malory",
year = "2015",
month = jul,
day = "21",
language = "English",
pages = "27--27",
booktitle = "Corpus Linguistics 2015",
note = "University Centre for Computer Corpus Research on Language. August 2014 ; Conference date: 21-07-2015 Through 24-07-2015",

}

RIS

TY - GEN

T1 - Tracing verbal aggression over time, using the Historical Thesaurus of English

AU - Malory, Beth

PY - 2015/7/21

Y1 - 2015/7/21

N2 - The work reported here seeks to demonstrate that automatic content analysis tools can be used effectively to trace pragmatic phenomena – including aggression – over time. In doing so, it builds upon preliminary work conducted by Archer (2014), using Wmatrix (Rayson 2008), in which Archer used six semtags – Q2.2 (speech acts), A5.1+/- (‘good/bad’ evaluation), A5.2+/- (‘true/false’ evaluation), E3- (‘angry/violent’), S1.2.4+/- (‘im/politeness’), and S7.2+/- (‘respect/lack of respect’) – to examine aggression in 200 Old Bailey trial texts covering the decade 1783-93. Having annotated the aforementioned Old Bailey dataset using Wmatrix, Archer (2014) targeted the utterances captured by the semtags listed above. This afforded her a useful “way in” to (by providing multiple potential indicators of) verbal aggression in the late eighteenth-century English courtroom. Using the ‘expand context’ facility within Wmatrix, and consulting the original trial transcripts, those incidences identified as verbally aggressive were then re-contextualised – thereby allowing Archer to disregard any that did not point to aggression in the final instance. The success of this approach allowed her to conclude that automatic content analysis tools like USAS can indeed be used to trace pragmatic phenomena (and in historical as well as modern texts). This approach was not without its teething problems, however. First, apart from those semtags which were used in conjunction with others, as portmanteau tags (e.g. Q2.2 with E3- to capture aggressive speech acts), the approach necessitated the targeting of individual semtags within a given text. The need to perform a time-intensive manual examination of the wider textual context thus made the use of large datasets prohibitive. Furthermore, there was a closely related problem concerning the tagset’s basis in The Longman Lexicon of Contemporary English (McArthur, 1981), and its consequent inability to take account of diachronic meaning change.
This tended to result in the occasional mis-assignment of words which have been subject to significant semantic change over time, including politely, insult and insulted. In one instance, for example, politely was used to describe the deftness with which a thief picked his victim’s pocket! The need for manual checks to prevent such mis-assignments from affecting results further necessitated the narrowness of scope to which Archer (2014) was subject. In the extension to this work, reported here, the authors present their solutions to these problems. These solutions have at their core an innovation which allows historical datasets to be tagged semantically, using themes derived from the Historical Thesaurus of the Oxford English Dictionary (henceforth HTOED). These themes have been identified as part of an AHRC/ESRC funded project entitled “Semantic Annotation and Mark Up for Enhancing Lexical Searches”, henceforth SAMUELS (grant reference AH/L010062/1). The SAMUELS project has also enabled researchers from the Universities of Glasgow, Lancaster, Huddersfield, Strathclyde and Central Lancashire to work together to develop a semantic annotation tool which, thanks to its advanced disambiguation facility, enables the automatic annotation of words, as well as multi-word units, in historical texts with their precise meanings. This means that pragmatic phenomena such as aggression can be more profitably sought automatically following the initial identification of what the authors have termed a ‘meaning chain’, that is, a series of HTOED-derived ‘themes’ analogous to DNA strings. This paper reports, first, on the authors’ identification of 68 potentially pertinent HTOED ‘themes’ and, second, on their investigation of the possible permutations of these themes, and the process by which they assessed which themes in which combinations best identified and captured aggression in their four datasets.
The datasets used for this research are drawn from Hansard and from Historic Hansard; and are taken from periods judged to be characterized, in some way, by political/national unrest or disquiet. The datasets represent the periods 1812-14 (i.e., “The War of 1812” between Great Britain and America), 1879-81 (a period of complex wrangling between two English governments and their opposition, led by fierce rivals Disraeli and Gladstone), 1913-19 (the First World War, including its immediate build-up and aftermath), and 1978-9 (“The Winter of Discontent”).

M3 - Conference contribution/Paper

SP - 27

EP - 27

BT - Corpus Linguistics 2015

T2 - University Centre for Computer Corpus Research on Language. August 2014

Y2 - 21 July 2015 through 24 July 2015

ER -