European Union Language Resources in Sketch Engine
| Authors | |
|---|---|
| Year of publication | 2016 |
| Type | Article in proceedings |
| Conference | Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) |
| MU Faculty / Unit | |
| Citation | |
| www | http://www.lrec-conf.org/proceedings/lrec2016/pdf/572_Paper.pdf |
| Field | Informatics |
| Keywords | JRC-Acquis; DCEP; DGT-TM; Europarl; EUR-Lex; Sketch Engine; parallel corpus; word sketch; parallel concordance |
| Description | Several parallel corpora built from European Union language resources are presented. They were processed with state-of-the-art tools and made available to researchers in the Sketch Engine corpus management system. A completely new resource is also introduced: the EUR-Lex corpus, one of the largest parallel corpora currently available. It contains 840 million English tokens, and its largest language pair (English-French) has more than 25 million aligned segments (paragraphs). |
| Related projects: |