Spelling variation, although present in all varieties of English, is particularly prevalent in SMS text messaging. Researchers argue that spelling variants in SMSes are principled and meaningful, reflecting patterns of variation across historical and contemporary texts, and contributing to the performance of social identities. However, for most languages (French being a notable exception), little attempt has yet been made to validate SMS spelling patterns empirically or to verify the extent to which they mirror those in other texts.
This article reports on the use of the VARD2 tool to analyse and normalise the spelling variation in a corpus of over 11,000 SMSes collected in the UK between 2004 and 2007. A second tool, DICER, was used to examine the mappings between variants and their equivalents in the normalised corpus. The resulting database of rules and frequencies enables comparison with other text types and the automatic normalisation of spelling in larger SMS corpora.
In addition to examining spelling trends through the DICER analysis, it was possible to place the spelling variants found in the SMS corpus into functional categories, with the ultimate aim of creating a taxonomy of SMS spelling. The article reports the findings of this categorisation process and discusses the difficulty of choosing categories for some spelling variants.
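To make the idea of a database of rules and frequencies concrete, the following is a minimal sketch of how character-level edit rules might be derived from pairs of variants and their equivalents and then counted. It is not the VARD2 or DICER implementation and does not use their interfaces; the function names `edit_rules` and `rule_frequencies` are hypothetical, the alignment relies on Python's standard `difflib`, and the word pairs are invented for illustration rather than taken from the corpus.

```python
from collections import Counter
from difflib import SequenceMatcher


def edit_rules(variant, equivalent):
    """Return (operation, variant_chars, equivalent_chars) for each span
    where the variant differs from its normalised equivalent.

    Hypothetical helper: the alignment comes from difflib, which is an
    assumption of this sketch, not the method used by DICER.
    """
    matcher = SequenceMatcher(None, variant, equivalent, autojunk=False)
    rules = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            rules.append((tag, variant[i1:i2], equivalent[j1:j2]))
    return rules


def rule_frequencies(pairs):
    """Count how often each edit rule occurs across all (variant, equivalent) pairs."""
    counts = Counter()
    for variant, equivalent in pairs:
        counts.update(edit_rules(variant, equivalent))
    return counts


# Invented example pairs, for illustration only (not corpus data).
pairs = [("goin", "going"), ("comin", "coming"), ("2day", "today")]
frequencies = rule_frequencies(pairs)

# List the rules from most to least frequent.
for rule, count in frequencies.most_common():
    print(rule, count)
```

A frequency table of this kind could in principle be compared across text types, since each rule is expressed independently of the words in which it occurs.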