Research output: Contribution to conference - Without ISBN/ISSN › Conference paper › peer-review
}
TY - CONF
T1 - Developing Asian language corpora: standards and practice.
AU - Xiao, R. Z.
AU - McEnery, A. M.
AU - Baker, J. P.
AU - Hardie, Andrew
PY - 2004/3/25
Y1 - 2004/3/25
N2 - This paper first discusses standards for developing Asian language corpora so as to facilitate international data exchange. Following this, we present two corpora of Asian languages developed at Lancaster University - the EMILLE Corpus, which contains 14 South Asian languages, and the Lancaster Corpus of Mandarin Chinese. Finally, we will demonstrate how to explore these corpora using Xara and other corpus tools.
AB - This paper first discusses standards for developing Asian language corpora so as to facilitate international data exchange. Following this, we present two corpora of Asian languages developed at Lancaster University - the EMILLE Corpus, which contains 14 South Asian languages, and the Lancaster Corpus of Mandarin Chinese. Finally, we will demonstrate how to explore these corpora using Xara and other corpus tools.
KW - standards
KW - corpora
KW - Asian languages
M3 - Conference paper
T2 - The 4th Workshop on Asian Language Resources
Y2 - 25 March 2004
ER -