MUNI-NLP Systems for Lower Sorbian-German and Lower Sorbian-Upper Sorbian Machine Translation @ WMT22
| Authors | |
|---|---|
| Year of publication | 2022 |
| Type | Article in Proceedings |
| Conference | Proceedings of the Seventh Conference on Machine Translation |
| MU Faculty or unit | |
| Citation | |
| Web | https://www.statmt.org/wmt22/pdf/2022.wmt-1.109.pdf |
| Keywords | NLP; machine translation; low-resource |
| Attached files | |
| Description | We describe our neural machine translation systems for the WMT22 shared task on unsupervised MT and very low-resource supervised MT. We submit supervised NMT systems for Lower Sorbian-German and Lower Sorbian-Upper Sorbian translation in both directions. By using a novel tokenization algorithm, data augmentation techniques such as Data Diversification (DD), and parameter optimization, we improve on our baselines by 10.5-10.77 BLEU for Lower Sorbian-German and by 1.52-1.88 BLEU for Lower Sorbian-Upper Sorbian. |
| Related projects | |