

Multilingual resources for European languages: Contributions of the CRATER project

Research output: Contribution to journal, Journal article

Published
Journal publication date: 1/11/1997
Journal: Literary and Linguistic Computing
Issue number: 4
Volume: 12
Number of pages: 8
Pages (from-to): 219-226
Publication status: Published
Original language: English

Abstract

Here we describe the contributions of the CRATER project to the development of multilingual resources for European languages. The project has developed a trilingual parallel aligned corpus of one million tokens each of Spanish, French, and English. The corpus has been part-of-speech tagged and lemmatized. Tools for the alignment of multilingual corpora at the sentence and word levels have been developed, which are of general significance to multilingual corpus linguistics. The Xerox part-of-speech tagger has also been retrained for Spanish, with important findings for part-of-speech tagging generally.