Electronic data

  • 2023jehangirphd

    Final published version, 3.6 MB, PDF document

    Available under license: CC BY-NC-ND: Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License

A corpus-based contrastive analysis of modal adverbs of certainty in English and Urdu

Research output: Thesis › Doctoral Thesis

Publication date: 26/04/2023
Number of pages: 315
Awarding Institution: Lancaster University
Award date: 26/04/2023
Original language: English


This study uses the corpus-based contrastive approach to explore the syntactic patterns and semantic and pragmatic meanings of modal adverbs of certainty (MACs) in English and Urdu. MACs are a descriptive category of epistemic modal adverb that semantically express a degree of certainty.
Due to the paucity of research to date on Urdu MACs, the study draws on existing literature on English MACs for cross-linguistic description of characteristics of English and Urdu MACs. A framework is constructed based on Boye’s (2012) description of syntactic characteristics of MACs, in terms of clause type and position within the clause; and on Simon-Vandenbergen and Aijmer’s (2007) description of their functional characteristics including both semantic (e.g. certainty, possibility) and pragmatic (e.g. authority, politeness) functions. Following Boye’s (2012) model, MACs may be grouped according to meaning: high certainty support – HCS (e.g. certainly); probability support – PS (e.g. perhaps); probability support for negative content – PSNC (e.g. perhaps not); and high certainty support for negative content – HCSNC (e.g. certainly not).
Methodologically, the study primarily follows earlier work that used corpus-based methods, with parallel and comparable corpora, for cross-linguistic comparative or contrastive analysis of a linguistic element or pattern. An approach to grammatical description based on such works as Quirk et al. (1985) and Biber et al. (1999) is likewise adopted.
An existing parallel corpus (EMILLE) and newly created comparable monolingual corpora of English and Urdu are utilised. The novel comparable corpora are web-based, comprising news and chat forum texts; the data is POS-tagged. Using the parallel corpus, Urdu MACs equivalent to the English MACs pre-identified from the existing literature are identified. Then, the comparable corpora are used to extract data on the relative frequencies of MACs and their distribution across various text types. This quantitative analysis demonstrates that in both languages all four semantic categories of MAC are found in all text types, but the distribution across text types is not uniform. HCS MACs, although diverse, are considerably lower in frequency than PS MACs in both English and Urdu. HCSNC and PSNC MACs are notably rarer than HCS and PS MACs in both languages.
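As a rough illustration of what a relative-frequency comparison across text types involves, the following minimal Python sketch normalises category counts by corpus size. It is not the procedure used in the thesis: the function name `relative_frequencies`, the two-item category mapping (taken only from the example adverbs named above) and the toy sentences are all assumptions made for illustration, and multi-word items such as "certainly not" are deliberately left out.

```python
from collections import Counter

# Hypothetical mapping from adverb to semantic category. Only the single-word
# examples mentioned in the abstract are included; this is not the thesis's list.
MAC_CATEGORIES = {"certainly": "HCS", "perhaps": "PS"}


def relative_frequencies(tokens, per=1_000_000):
    """Return occurrences of each MAC category per `per` tokens."""
    if not tokens:
        return {}
    counts = Counter(
        MAC_CATEGORIES[t.lower()] for t in tokens if t.lower() in MAC_CATEGORIES
    )
    # Normalise raw counts by corpus size so text types of different
    # lengths can be compared directly.
    return {cat: n * per / len(tokens) for cat, n in counts.items()}


# Invented toy sentences standing in for two text types (not real corpus data).
toy_corpus = {
    "news": "the minister will certainly attend and perhaps speak".split(),
    "forum": "perhaps it works perhaps tomorrow".split(),
}

for text_type, tokens in toy_corpus.items():
    print(text_type, relative_frequencies(tokens, per=1000))
```

Normalising to a fixed base (per thousand here, more commonly per million for large corpora) is what makes counts from differently sized subcorpora comparable.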
The analysis demonstrates striking similarities in the syntactic positioning of MACs in English and Urdu, with minor differences. Except for Urdu PSNC MACs, all categories most frequently occur in clause medial position, in both independent and dependent clauses, in both languages. The exception arises because hō nahīṁ saktā ‘possibly not’ is most frequent in clause final position.
MACs in both languages most often have scope over the whole clause in which they occur; semantically, the core function of MACs is to express the speaker’s certainty and high confidence (for HCS and HCSNC) or low certainty and low confidence (for PS and PSNC) in the truth of a proposition. These groups thus primarily function as certainty markers and probability markers, respectively. In both languages, speakers also use MACs as short responses to questions, and in responses to their own rhetorical questions. HCS and PS MACs in clause final position may in addition function as tags which prompt a response from the interlocutor. When they co-occur with modal verbs, MACs emphasise or downtone, but do not entirely change, the modal verb’s epistemic or deontic meaning. In both languages, all MACs preferentially occur in the then-clause of a conditional sentence.
Pragmatically, MACs are used for emphasis, expectation, counter-expectation and politeness. Additionally, HCS and HCSNC MACs are used to express solidarity and authority, and PS and PSNC MACs are used as hedges. Readings of expectation, hedge, politeness, and solidarity may be relevant simultaneously. Interestingly, reduplication for emphasis, common in Urdu, is only observed for one Urdu MAC, żarūr ‘definitely’, whereas all English MACs reduplicate for emphasis in at least some cases. Another difference is that, in Urdu, the sequence śāyad nahīṁ yaqīnān ‘not perhaps, certainly’ expresses speaker authority within a response to a previous speaker, but no English MAC exhibits this behaviour.
Despite overall similarity, minor dissimilarities in the use of English and Urdu MACs are observable, in the use of MACs as replies to questions, and in their use within interrogative clauses. This analysis supports the contention that, cross-linguistically, despite linguistic variation, the conceptual structures and functional-communicative considerations that shape natural languages are largely universal.
This study makes two main contributions. First, conducting a descriptive analysis of English and Urdu MACs using a corpus-based contrastive method both illuminates this specific question in modality and sets a precedent for future corpus-based descriptive studies of Urdu. Second, it brings together the categories of modal adverbs of certainty and of possibility, previously treated as distinct, in a single category of modal adverbs that are used to express a degree of certainty, i.e. MACs. From a practical standpoint, an additional contribution of this study is the creation and open release of a large Urdu corpus designed for comparable corpus research, the Lancaster Urdu Web Corpus, fulfilling a need for such a corpus in the field.