Final published version
Research output: Contribution to Journal/Magazine › Journal article › peer-review
Journal publication date | 1/11/1997 |
---|---|
Journal | Literary and Linguistic Computing |
Issue number | 4 |
Volume | 12 |
Number of pages | 8 |
Pages (from-to) | 219-226 |
Publication Status | Published |
Original language | English |
Here we describe the contributions of the CRATER project to the development of multilingual resources for European languages. The project has developed a trilingual parallel aligned corpus of one million tokens each of Spanish, French, and English. The corpus has been part-of-speech tagged and lemmatized. Tools for the alignment of multilingual corpora at the sentence and word levels have been developed, which are of general significance to multilingual corpus linguistics. The Xerox part-of-speech tagger has also been retrained for Spanish, with important findings for part-of-speech tagging generally.