Corpus Eye

The menu-based CQP interface (round flags) was designed by Eckhard Bick and programmed in Java by Poul Henriksen and Nikolaj Hald Nielsen under a special ISK grant. It uses the IMS Corpus Workbench and was inspired by user feedback and by a similar interface made by Paul Meurer for Oslo University's Tekstlaboratoriet. The older, grep-based search system (rectangular flags) and the treebank interfaces were designed and programmed by Eckhard Bick for VISL, using ordinary Linux tools and Tgrep2. Currently, CorpusEye maintenance and development are handled by Tino Didriksen.

Raw text corpora were harvested from the internet and through provider APIs (Facebook, Twitter), downloaded from existing repositories (Leipzig Wortschatz corpora, Wikipedia dumps), licensed (ECI, DSL) or kindly provided by project partners (Oxford University, Linguateca, ATILF, NILC, Red Hen Lab, the Danish parliament and others). Some sources were scanned and OCR-converted at the ISK (Skalk) or acquired by ISK employees through private channels. For a full list of corpus credits and references, see our copyright page, which is also linked from the individual corpus pages.

Grammatical corpus annotation, covering both morphosyntactic tags (CG) and tree structures (PSG), was performed with Eckhard Bick's VISL parsers: PALAVRAS (Portuguese), PALAVRAS-HIS (Spanish), DanGram (Danish), GerGram (German), EngGram (English), SweGram (Swedish), NorGram (Norwegian Bokmål), EspGram (Esperanto), ItaGram (Italian) and FrAG (French), all of which are accessible online (including a file upload service). For Romanian, the morphological annotation was performed with Dan Tufis' probabilistic MSD tagger.

Treebank revision was supervised work involving, among others, the following VISL students: Susanna Afonso and Raquel Marchi (Portuguese), Ina Størner Rasmussen, Camilla Pedersen, Dorte Lønsmann and Kim Ebensgaard Jensen (Danish), and Ane Dybro Johansen (French). The treebank projects received funding support from Linguateca (Portuguese), The Nordic Council of Ministers (Danish) and ATILF (French).

More information on the VISL project, as well as live grammatical analysis and a number of grammar teaching tools, is available at the VISL main site or its research-oriented beta version.