Developing Asian language corpora: standards and practice.

Research output: Contribution to conference - Without ISBN/ISSN: Conference paper, peer-reviewed

Published
Publication date: 25/03/2004
Number of pages: 8
Original language: English
Event: The 4th Workshop on Asian Language Resources - Sanya, China
Duration: 25/03/2004 → …

Conference

Conference: The 4th Workshop on Asian Language Resources
City: Sanya, China
Period: 25/03/04 → …

Abstract

This paper first discusses standards for developing Asian language corpora to facilitate international data exchange. We then present two corpora of Asian languages developed at Lancaster University: the EMILLE Corpus, which contains 14 South Asian languages, and the Lancaster Corpus of Mandarin Chinese. Finally, we demonstrate how to explore these corpora using Xara and other corpus tools.