Corpora and other language resources

Corpora

| Corpus title | Size | Time | Source | Language |
|---|---|---|---|---|
| British National Corpus (BNC) | 100 million tokens | mid 1970s - early 1990s | Oxford | British English |
| The Brown Corpus | 1 million tokens | 1961 | ICAME | American English |
| The Lancaster/Oslo-Bergen Corpus (LOB) | 1 million tokens | 1961 | ICAME | British English |
| International Corpus of English (ICE) | xxxxxx | varieties of world Englishes | xxxxxxxx | world English |
| DWDS Kernkorpus | | 1900-1999 | Berlin-Brandenburgische Akademie der Wissenschaften: https://www.dwds.de/d/korpora/kern | German |
| DWDS Kernkorpus 21 | | 2000-2010 | Berlin-Brandenburgische Akademie der Wissenschaften: https://www.dwds.de/d/korpora/korpus21 | German |
| Deutscher Wortschatz Project | 35 million sentences, 500 million words | | http://wortschatz.uni-leipzig.de/ | German |
| Hamburg Dependency Treebank | | German news site heise.de, articles published between 1996 and 2001 | http://hdl.handle.net/11022/0000-0000-7FC7-2 | German |
| IDS-Corpora | | | http://www.ids-mannheim.de/kt/corpora.html | German |
| LIMAS-Korpus | 1 million words, 500 texts / fragments | 1970s | http://www.korpora.org/Limas/ | German |