bilingual corpus

Bilingual corpora consist of language data from two languages, sometimes organized as parallel texts (a text alongside its translation) or comparable texts (texts of approximately the same content and register). Corpora that cover more than two languages are called multilingual corpora. Multilingual corpora come in different types, notably translation corpora and comparable corpora.

An example of a multilingual translation corpus is EUROPARL, the European Parliament Proceedings Parallel Corpus 1996-2011.
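To make the idea of a translation corpus more concrete, the sketch below pairs up sentences from two texts. It assumes a sentence-aligned layout in which line i of one text is the translation of line i of the other; this layout, the function name `align_sentences`, and the sample sentences are illustrative assumptions, not a description of how any particular corpus is distributed.

```python
from typing import Iterable, Iterator, Tuple


def align_sentences(
    source_lines: Iterable[str], target_lines: Iterable[str]
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) sentence pairs from two line-aligned texts.

    Assumption: line i of the source corresponds to line i of the target.
    Pairs in which either side is empty are skipped.
    """
    # zip stops at the shorter input, so trailing unmatched lines are ignored.
    for src, tgt in zip(source_lines, target_lines):
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:
            yield src, tgt


# Hypothetical sample data standing in for two aligned files.
english = ["Good morning.", "Thank you."]
german = ["Guten Morgen.", "Danke."]

pairs = list(align_sentences(english, german))
for en, de in pairs:
    print(f"{en}\t{de}")
```

A comparable corpus would not fit this structure, since its texts share content and register but are not sentence-by-sentence translations of each other.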