Are We There Yet? A Thorough Evaluation of POS Tagging on Czech
Authors | |
---|---|
Year of publication | 2025 |
Type | Article in proceedings |
Conference | Text, Speech, and Dialogue, 28th International Conference, TSD 2025 |
MU Faculty or unit | |
Citation | |
Web | Conference proceedings |
DOI | https://doi.org/10.1007/978-3-032-02551-7_23 |
Keywords | morphological analysis; evaluation; POS tagging |
Description | With recent advances in natural language processing, part-of-speech (POS) tagging is one of the areas that has seen significant improvements. Contemporary state-of-the-art tools report accuracies approaching 100% even for morphologically rich languages such as Czech, which used to pose a challenge. In this study, we investigate whether such accuracy is reproducible on real-world data, as previous research has demonstrated substantial discrepancies between evaluations conducted on gold-standard corpora and those based on text typically occurring on the web. To address this issue, we selected a set of widely used and well-established POS taggers and applied them to a random sample of documents from the csTenTen23 web corpus. Tokens for which the taggers produced differing outputs were then manually annotated. Our results indicate that the ability of modern POS taggers to handle real-world data, including a broad range of genres and topics, has improved significantly in comparison to the earlier statistically based POS taggers. Furthermore, we observe a shift in the most problematic tagging category: whereas case assignment was previously a major source of errors, the best current models struggle more with POS category distinctions. We argue that this shift may reflect ambiguities inherent in the POS category itself, where even human annotators may not fully agree. |
Related projects: |