A morphosyntactic categorisation scheme for the automated analysis of Nepali

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN - Chapter (peer-reviewed)

Published

Standard

A morphosyntactic categorisation scheme for the automated analysis of Nepali. / Hardie, Andrew; Lohani, Ram Raj; Regmi, Bhim N. et al.
Annual Review of South Asian Languages and Linguistics 2009. ed. / Rajendra Singh. Berlin: Mouton de Gruyter, 2009. p. 171-196 (Trends in linguistics. Studies and monographs; Vol. 222).

Harvard

Hardie, A, Lohani, RR, Regmi, BN & Yadava, YP 2009, A morphosyntactic categorisation scheme for the automated analysis of Nepali. in R Singh (ed.), Annual Review of South Asian Languages and Linguistics 2009. Trends in linguistics. Studies and monographs, vol. 222, Mouton de Gruyter, Berlin, pp. 171-196.

APA

Hardie, A., Lohani, R. R., Regmi, B. N., & Yadava, Y. P. (2009). A morphosyntactic categorisation scheme for the automated analysis of Nepali. In R. Singh (Ed.), Annual Review of South Asian Languages and Linguistics 2009 (pp. 171-196). (Trends in linguistics. Studies and monographs; Vol. 222). Mouton de Gruyter.

Vancouver

Hardie A, Lohani RR, Regmi BN, Yadava YP. A morphosyntactic categorisation scheme for the automated analysis of Nepali. In Singh R, editor, Annual Review of South Asian Languages and Linguistics 2009. Berlin: Mouton de Gruyter. 2009. p. 171-196. (Trends in linguistics. Studies and monographs).

Author

Hardie, Andrew ; Lohani, Ram Raj ; Regmi, Bhim N. et al. / A morphosyntactic categorisation scheme for the automated analysis of Nepali. Annual Review of South Asian Languages and Linguistics 2009. editor / Rajendra Singh. Berlin : Mouton de Gruyter, 2009. pp. 171-196 (Trends in linguistics. Studies and monographs).

Bibtex

@inbook{1218355a545a45d58763c762a615ed05,
title = "A morphosyntactic categorisation scheme for the automated analysis of Nepali",
abstract = "This paper describes the linguistic rationale underlying the part-of-speech tagset used for tagging the Nepali National Corpus. In particular, three conceptually complex areas are discussed in detail. In the first place, the nature of Nepali postpositions is explored, and the approach that the tagset takes to them – in which postpositions are tokenised separately to the nouns or other words to which they are attached – is justified. A similar exploration of gender marking, however, supports an opposite approach, where gender is treated as a feature of the word on which it is marked, and indicated in that word{\textquoteright}s tag. It is further argued that an inconsistent treatment of gender on nouns, as opposed to adjectives and other words that agree with nouns, is justified for Nepali. Thirdly, the very great complexity of Nepali verb inflection (some of it created by very productive compounding) is shown to necessitate the use, within the tagset, of a simplified model of the Nepali verb. A brief analysis of the similarities and differences between this tagset and part-of-speech annotation schemes for some closely related languages is undertaken. Finally, the implementation of the tagset in an automated tagging system is summarised and some directions for future work outlined.",
author = "Andrew Hardie and Lohani, {Ram Raj} and Regmi, {Bhim N.} and Yadava, {Yogendra P.}",
year = "2009",
language = "English",
isbn = "9783110225594",
series = "Trends in linguistics. Studies and monographs",
publisher = "Mouton de Gruyter",
pages = "171--196",
editor = "Rajendra Singh",
booktitle = "Annual Review of South Asian Languages and Linguistics 2009",

}

RIS

TY - CHAP

T1 - A morphosyntactic categorisation scheme for the automated analysis of Nepali

AU - Hardie, Andrew

AU - Lohani, Ram Raj

AU - Regmi, Bhim N.

AU - Yadava, Yogendra P.

PY - 2009

Y1 - 2009

N2 - This paper describes the linguistic rationale underlying the part-of-speech tagset used for tagging the Nepali National Corpus. In particular, three conceptually complex areas are discussed in detail. In the first place, the nature of Nepali postpositions is explored, and the approach that the tagset takes to them – in which postpositions are tokenised separately to the nouns or other words to which they are attached – is justified. A similar exploration of gender marking, however, supports an opposite approach, where gender is treated as a feature of the word on which it is marked, and indicated in that word’s tag. It is further argued that an inconsistent treatment of gender on nouns, as opposed to adjectives and other words that agree with nouns, is justified for Nepali. Thirdly, the very great complexity of Nepali verb inflection (some of it created by very productive compounding) is shown to necessitate the use, within the tagset, of a simplified model of the Nepali verb. A brief analysis of the similarities and differences between this tagset and part-of-speech annotation schemes for some closely related languages is undertaken. Finally, the implementation of the tagset in an automated tagging system is summarised and some directions for future work outlined.

AB - This paper describes the linguistic rationale underlying the part-of-speech tagset used for tagging the Nepali National Corpus. In particular, three conceptually complex areas are discussed in detail. In the first place, the nature of Nepali postpositions is explored, and the approach that the tagset takes to them – in which postpositions are tokenised separately to the nouns or other words to which they are attached – is justified. A similar exploration of gender marking, however, supports an opposite approach, where gender is treated as a feature of the word on which it is marked, and indicated in that word’s tag. It is further argued that an inconsistent treatment of gender on nouns, as opposed to adjectives and other words that agree with nouns, is justified for Nepali. Thirdly, the very great complexity of Nepali verb inflection (some of it created by very productive compounding) is shown to necessitate the use, within the tagset, of a simplified model of the Nepali verb. A brief analysis of the similarities and differences between this tagset and part-of-speech annotation schemes for some closely related languages is undertaken. Finally, the implementation of the tagset in an automated tagging system is summarised and some directions for future work outlined.

M3 - Chapter (peer-reviewed)

SN - 9783110225594

T3 - Trends in linguistics. Studies and monographs

SP - 171

EP - 196

BT - Annual Review of South Asian Languages and Linguistics 2009

A2 - Singh, Rajendra

PB - Mouton de Gruyter

CY - Berlin

ER -