Other corpora

This section lists other commonly used corpora that are not available on the BYU portal. Most of them do not provide an online search interface; instead, you need to download and compile their files before you can use them (see the Make your own corpus tab for guidance). For a longer list of corpora and a corpus finder, refer to Corpus-based Linguistics Links and the Corpus Resource Database (CoRD).

Diachronic corpora

ARCHER (A Representative Corpus of Historical English Registers)

DCPSE (Diachronic Corpus of Present-Day Spoken English)

Helsinki Corpus of English Texts

Lampeter Corpus of Early Modern English Tracts

CEEC (Corpus of Early English Correspondence) and daughter corpora

Old Bailey Corpus (Late Modern English)

Brown/LOB family corpora

BROWN Corpus of American English, 1960s

LOB (Lancaster-Oslo/Bergen Corpus) of British English, 1960s

FLOB (Freiburg-Lancaster-Oslo/Bergen Corpus) of British English, 1990s

FROWN (Freiburg-Brown Corpus) of American English, 1990s

CLOB Corpus (British English, 2009)

Crown Corpus (American English, 2009)

Other British and American English corpora

ANC (American National Corpus) – analogous to the BNC

CSAE (Corpus of Spoken American English)/Santa Barbara Corpus of Spoken American English

MICASE (Michigan Corpus of Academic Spoken English)

LLC (London-Lund Corpus), spoken British English

SCOTS (Scottish Corpus of Texts & Speech)

Corpora of other English varieties

ACE (Australian Corpus of English)

FRED (Freiburg English Dialects)

ICE (International Corpus of English), covering different varieties of English, e.g. ICE-GB

Kolhapur Corpus of Indian English

WC (Wellington Corpus of Written New Zealand English)

WSC (Wellington Corpus of Spoken New Zealand English)

CIE (Corpus of Irish English), 14th – 20th century

Specialised corpora

CSPAE (Corpus of Spoken Professional American English)

COLT (Bergen Corpus of London Teenage Language)

ICLE (International Corpus of Learner English)

BAWE (British Academic Written English) and BASE (British Academic Spoken English)