DSL Shared task 2016: Perfect Is The Enemy of Good Language Discrimination Through Expectation-Maximization and Chunk-based Language Model
| Authors | |
|---|---|
| Year of publication | 2016 |
| Type | Article in Proceedings |
| Conference | Proceedings of the Third Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial3) |
| MU Faculty or unit | |
| Citation | |
| web | https://aclanthology.info/pdf/W/W16/W16-4815.pdf |
| Field | Informatics |
| Keywords | language discrimination;expectation maximization;language model |
| Description | In this paper we investigate two approaches to discriminating between similar languages: the expectation-maximization algorithm for estimating the conditional probability P(word \| language), and byte-level language models similar to compression-based language modelling methods. These methods reached accuracies of 86.6% and 88.3%, respectively, on set A of the DSL Shared task 2016 competition. |