Corpora and other language resources

Tag sets

Corpora

| Corpus title | Size | Time | Source | Language |
|---|---|---|---|---|
| British National Corpus (BNC) | 100 million tokens | mid-1970s to early 1990s | Oxford | British English |
| The Brown Corpus | 1 million tokens | 1961 | ICAME | American English |
| The Lancaster/Oslo-Bergen Corpus (LOB) | 1 million tokens | 1961 | ICAME | British English |
| International Corpus of English (ICE) | xxx | xxx | varieties of world Englishes; International Corpus of English (ICE) at Zurich, CH | world English |
| Mark Davies' English Corpora | xxx | xxx | diverse set of corpora; Mark Davies | American English, British English, international English |
| Text corpora in the DWDS | various | various | https://www.dwds.de/r | German |
| DWDS Kernkorpus | | 1900-1999 | Berlin-Brandenburgische Akademie der Wissenschaften: https://www.dwds.de/d/korpora/kern | German |
| DWDS Kernkorpus 21 | | 2000-2010 | Berlin-Brandenburgische Akademie der Wissenschaften: https://www.dwds.de/d/korpora/korpus21 | German |
| Hamburg Dependency Treebank | | articles from the German news site heise.de, published between 1996 and 2001 | http://hdl.handle.net/11022/0000-0000-7FC7-2 | German |
| IDS-Corpora | | | http://www.ids-mannheim.de/kt/corpora.html | German |
| LIMAS-Korpus | 1 million words, 500 texts / fragments | 1970s | http://www.korpora.org/Limas/ | German |
| Arabic News Texts Corpus (AntCorpus) | | | https://antcorpus.github.io/ | Arabic |
| Wortschatz Leipzig | various sample sizes | misc. | https://wortschatz.uni-leipzig.de/de/download | various (Arabic, English, French, German, Russian) |
| Språkbanken Text | | | https://spraakbanken.gu.se/en/resources | Swedish |