On the evaluation and optimization of LabeledPAM

Authors

JÁNOŠOVÁ Miriama, LANG Andreas, BUDÍKOVÁ Petra, SCHUBERT Erich, DOHNAL Vlastislav

Year of publication 2025
Type Article in Periodical
Magazine / Source Information Systems
MU Faculty or unit

Faculty of Informatics

Citation
web https://www.sciencedirect.com/science/article/pii/S030643792500064X
Doi https://doi.org/10.1016/j.is.2025.102580
Keywords semi-supervised clustering; k-medoids; partitioning around medoids; FasterPAM; semi-supervised classification
Description The analysis of complex and weakly labeled data is increasingly popular. Traditional unsupervised clustering aims to uncover interrelated sets of objects based on feature-based similarity. This approach often reaches its limits when dealing with complex multimedia data due to the curse of dimensionality, presenting unique challenges. Semi-supervised clustering, which leverages small amounts of labeled data, has the potential to cope with this problem. In this work, we delve into LabeledPAM, a semi-supervised clustering method, which extends FasterPAM, a state-of-the-art k-medoids clustering algorithm. Our algorithm is designed for both semi-supervised classification, where labels are assigned to clusters with minimal labeled data, and semi-supervised clustering, where new clusters with unknown labels are identified. We propose an optimization to the original LabeledPAM algorithm that reduces its computational complexity. Additionally, we provide an implementation in Rust, which integrates seamlessly with Python libraries. To assess LabeledPAM’s performance, we empirically evaluate its properties by comparing it against a range of semi-supervised clustering algorithms, including density-based ones. We conduct experiments on a collection of real-world datasets. Our results demonstrate that LabeledPAM achieves competitive clustering quality while maintaining efficiency across various scenarios, showing its versatility for real-world applications.
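For readers unfamiliar with k-medoids, the sketch below shows the basic swap-based idea that partitioning around medoids builds on: repeatedly replace a medoid with a non-medoid whenever that lowers the total distance of all points to their nearest medoid. It is a naive, unsupervised illustration only. It is not LabeledPAM, it does not use labels, and it has none of the FasterPAM speedups. The function names (`total_deviation`, `naive_pam`), the trivial initialization, and the toy data are all assumptions made for this example.

```python
from itertools import product


def total_deviation(dist, medoids):
    """Sum of distances from every point to its nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def naive_pam(dist, k):
    """Naive swap-based k-medoids on a precomputed distance matrix.

    Illustrative only: every candidate swap recomputes the full cost,
    which is far slower than FasterPAM, and the result is a local optimum.
    """
    n = len(dist)
    medoids = list(range(k))  # assumed trivial initialization: first k points
    best = total_deviation(dist, medoids)
    improved = True
    while improved:
        improved = False
        # Try replacing medoid slot i with every non-medoid candidate.
        for i, candidate in product(range(k), range(n)):
            if candidate in medoids:
                continue
            trial = medoids.copy()
            trial[i] = candidate
            cost = total_deviation(dist, trial)
            if cost < best:  # accept any swap that strictly lowers the cost
                medoids, best = trial, cost
                improved = True
    # Assign each point to the slot of its nearest medoid.
    labels = [min(range(k), key=lambda j: row[medoids[j]]) for row in dist]
    return medoids, labels, best


# Toy data (hypothetical): two well-separated groups on a line.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
medoids, labels, cost = naive_pam(dist, k=2)
print(medoids, labels, cost)
```

Because the medoids are always actual data points and only pairwise distances are needed, the same loop works with any dissimilarity measure, which is one reason k-medoids suits complex data where a meaningful mean cannot be computed.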