DSL Shared task 2016: Perfect Is The Enemy of Good Language Discrimination Through Expectation-Maximization and Chunk-based Language Model
| Authors | |
|---|---|
| Year of publication | 2016 |
| Type | Article in Proceedings |
| Conference | Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3) |
| MU Faculty or unit | |
| Citation | |
| www | https://aclanthology.info/pdf/W/W16/W16-4815.pdf |
| Field | Informatics |
| Keywords | language discrimination;expectation maximization;language model |
| Description | In this paper we investigate two approaches to the discrimination of similar languages: the expectation-maximization algorithm for estimating the conditional probability P(word \| language), and byte-level language models similar to compression-based language modelling methods. On set A of the DSL Shared Task 2016 competition, these methods reached accuracies of 86.6% and 88.3%, respectively. |
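
The record only names the byte-level language model approach, so the following is a minimal sketch of the general idea rather than the method from the paper. It assumes a trigram order, add-one smoothing over the 256 possible byte values, and zero-byte padding at the start of each text; the class `ByteNgramModel`, the function `classify`, the labels `lang_a` and `lang_b`, and the toy training strings are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


class ByteNgramModel:
    """Byte-level n-gram language model with add-one smoothing (illustrative)."""

    def __init__(self, order=3):
        self.order = order              # n-gram length in bytes (assumed value)
        self.ngram_counts = Counter()   # (context bytes, next byte) -> count
        self.context_counts = Counter() # context bytes -> count

    def _padded(self, text):
        # Pad with zero bytes so the first real byte also has a full context.
        return b"\x00" * (self.order - 1) + text.encode("utf-8")

    def train(self, text):
        data = self._padded(text)
        for i in range(self.order - 1, len(data)):
            context = data[i - self.order + 1:i]
            self.ngram_counts[(context, data[i])] += 1
            self.context_counts[context] += 1

    def log_prob(self, text):
        # Sum of smoothed log-probabilities of each byte given its context.
        data = self._padded(text)
        total = 0.0
        for i in range(self.order - 1, len(data)):
            context = data[i - self.order + 1:i]
            numerator = self.ngram_counts[(context, data[i])] + 1
            denominator = self.context_counts[context] + 256
            total += math.log(numerator / denominator)
        return total


def classify(models, text):
    # Pick the language whose model assigns the text the highest log-probability.
    return max(models, key=lambda lang: models[lang].log_prob(text))


# Toy training data, invented for this sketch only.
models = {"lang_a": ByteNgramModel(), "lang_b": ByteNgramModel()}
models["lang_a"].train("aaaa abab aaaa")
models["lang_b"].train("zzzz zyzy zzzz")
```

Calling `classify(models, text)` returns the label of the model under which `text` is most probable. Working on bytes rather than words avoids tokenization and handles any script, which is one reason such models resemble compression-based language identification.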