Selecting Interesting Articles Using Their Similarity Based Only on Positive Examples

Warning

This publication doesn't include Institute of Computer Science. It includes Faculty of Informatics. Official publication website can be found on muni.cz.
Authors

HROZA Jiří, ŽIŽKA Jan

Year of publication 2005
Type Article in Proceedings
Conference Computational Linguistics and Intelligent Text Processing
MU Faculty or unit

Faculty of Informatics

Field Informatics
Keywords machine learning; text categorization; text filtration; text similarity; k-NN; ranking
Description The task of automatically searching for interesting text documents frequently suffers from a very poor balance between documents representing positive and negative examples, or from one class missing completely. This paper suggests a ranking approach based on the k-NN algorithm, adapted to determine the degree of similarity of new documents to a representative collection of positive examples only. From the viewpoint of the precision-recall relation, a user can decide in advance how many articles, and how similar, should be released through the filter.
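
The idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words representation, the cosine similarity measure, the scoring rule (mean of the k highest similarities to the positive collection), and all function and parameter names below are assumptions made for illustration only.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (an assumed document representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_score(doc_vec, positive_vecs, k=3):
    """Mean similarity to the k nearest positive examples (assumed scoring rule)."""
    top = sorted((cosine(doc_vec, p) for p in positive_vecs), reverse=True)[:k]
    return sum(top) / len(top) if top else 0.0


def rank_and_filter(candidates, positives, k=3, threshold=0.0, max_released=None):
    """Rank candidates by similarity to the positive collection only.

    threshold    -- hypothetical knob: how similar an article must be
    max_released -- hypothetical knob: how many articles pass the filter
    """
    positive_vecs = [vectorize(p) for p in positives]
    scored = [(knn_score(vectorize(c), positive_vecs, k), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    released = [(s, c) for s, c in scored if s >= threshold]
    return released if max_released is None else released[:max_released]


positives = [
    "machine learning for text categorization",
    "ranking text documents by similarity",
]
candidates = [
    "football match results",
    "text categorization with machine learning",
]
for score, text in rank_and_filter(candidates, positives, k=1, threshold=0.1):
    print(f"{score:.2f}  {text}")
```

Raising `threshold` or lowering `max_released` releases fewer, more similar articles, which generally favours precision over recall; relaxing them does the opposite. No negative examples are needed at any point, since each new document is compared only with the positive collection.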