Towards Useful Word Embeddings: Evaluation on Information Retrieval, Text Classification, and Language Modeling

Notice

This publication falls under the Faculty of Informatics, not the Institute of Computer Science. The official publication page is on the muni.cz website.

Authors	NOVOTNÝ Vít, ŠTEFÁNIK Michal, LUPTÁK Dávid, SOJKA Petr
Year of publication	2020
Type	Article in proceedings
Conference	Proceedings of the Fourteenth Workshop on Recent Advances in Slavonic Natural Language Processing, RASLAN 2020
MU Faculty / Unit	Faculty of Informatics
www	workshop homepage PDF (fulltext)
Keywords	Evaluation; word vectors; word2vec; fastText; information retrieval; text classification; language modeling
Description	Since the seminal work of Mikolov et al. (2013), word vectors of log-bilinear models have found their way into many NLP applications and were extended with the positional model. Although the positional model improves accuracy on the intrinsic English word analogy task, prior work has neglected its evaluation on extrinsic end tasks, which correspond to real-world NLP applications. In this paper, we describe our first steps in evaluating positional weighting on the information retrieval, text classification, and language modeling extrinsic end tasks.
Related projects:	Applied research: software architectures of critical infrastructures, computer systems security, natural language processing and language engineering, visualization of large data and augmented reality. Involving students of the Faculty of Informatics in the international scientific community 20