Building a corpus of spoken Sylheti.

Research output: Contribution to Journal/Magazine; Journal article; peer-review

Journal publication date: 12/2000
Journal: Literary and Linguistic Computing
Issue number: 4
Number of pages: 12
Pages (from-to): 421-432
Publication status: Published
Original language: English


This paper describes the construction of a corpus of spoken Sylheti. The corpus was created to examine the difficulties of building spoken language corpora in which features such as code switching are common. Code switching is described here simply as switching from one language to another during the course of an interaction; this description, however, disguises a host of situations, which the paper examines. The paper also presents a transliteration scheme for Sylheti based on the Roman alphabet.