Online corpora collections
IDS-Korpora:
German corpus archive, searchable via COSMAS (1.9 billion words!)
AC/DC Corpora:
Very large collection of Portuguese corpora (by Diana Santos, Linguateca), much of it PALAVRAS-annotated (Eckhard Bick)
Corpus del Español:
Fast and unabridged comparative corpus of historical and modern Spanish (by Mark Davies)
BNC's Corpus Page:
Overview of English corpora, online access to the BNC (100 million words)
Cobuild Corpus:
Mixed British/American English corpus (50 million words), includes transcribed speech (by Collins)
Internet corpus tools
webcorp:
Searching the internet as a corpus, slow but nice. Keeps going.
web-conc:
Concordancing with the whole internet as a corpus. Fast AND nice.
Other corpora link overviews
Corpora Links:
Link collection of corpora in many languages at the University of Tübingen (maintained by Laura Kallmeyer)
Corpus Linguistics:
Link collection of corpora in many languages (maintained by Michael Barlow)
Statistical NLP and computational corpus linguistics:
Link collection of corpora and NLP resources in many languages at the Stanford InfoLab (maintained by Christopher Manning)