Topic Modelling of the Czech Supreme Court Decisions

Authors

NOVOTNÁ Tereza; HARAŠTA Jakub; KÓL Jakub

Year of publication 2020
Type Article in Proceedings
Conference Proceedings of the Fourth Workshop on Automated Semantic Analysis of Information in Legal Text held online in conjunction with the 33rd International Conference on Legal Knowledge and Information Systems (JURIX 2020)
MU Faculty or unit

Faculty of Law

Citation
Web Open access proceedings
Keywords topic modelling; Latent Dirichlet Allocation; Non-negative Matrix Factorization; court decisions; coherence score
Description The Czech Supreme Court has issued more than 130 000 decisions since 1993. This volume makes it difficult for legal practitioners to research the case law. This work focuses on topic models for enhanced information retrieval through the identification of case law addressing the same or similar issues. We provide an initial quantitative evaluation of Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF) models according to the C_V coherence score for different numbers of topics, n = {10, 20, ..., 90, 100}. Additionally, we provide a qualitative evaluation of the LDA and NMF models for n = {20, 30}, which will serve as a starting point for a subsequent expert-user evaluation.
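To illustrate what a coherence score measures, the following is a minimal sketch in Python. It is not the C_V measure used in the paper: C_V combines NPMI with sliding-window co-occurrence counts and cosine similarity of context vectors, whereas this sketch computes a simplified, document-level average NPMI over the top words of a topic. The function name `npmi_coherence`, the toy corpus, and the word lists are all hypothetical and chosen only for illustration; the grid `topic_counts` mirrors the values of n given in the description.

```python
import math
from itertools import combinations

# Grid of topic counts from the description: n = {10, 20, ..., 90, 100}.
topic_counts = list(range(10, 101, 10))


def npmi_coherence(top_words, documents):
    """Average NPMI over all pairs of a topic's top words.

    Simplified stand-in for a coherence score: co-occurrence is counted
    per document, not per sliding window as in C_V.
    """
    doc_sets = [set(doc) for doc in documents]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of documents containing all the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words)) / n_docs

    scores = []
    for w1, w2 in combinations(top_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p12 == 0:
            scores.append(-1.0)  # words never co-occur: lowest NPMI
        elif p12 == 1.0:
            scores.append(1.0)   # words co-occur everywhere; avoids log(1) = 0 in the denominator
        else:
            pmi = math.log(p12 / (p1 * p2))
            scores.append(pmi / -math.log(p12))
    return sum(scores) / len(scores)


# Hypothetical tokenised toy corpus (not data from the paper).
docs = [
    ["contract", "damages", "breach", "liability"],
    ["contract", "breach", "damages"],
    ["custody", "child", "parent", "maintenance"],
    ["child", "parent", "custody"],
]

coherent = ["contract", "breach", "damages"]  # words that occur together
mixed = ["contract", "child", "liability"]    # words from unrelated documents

print(npmi_coherence(coherent, docs))
print(npmi_coherence(mixed, docs))
```

A topic whose top words tend to appear in the same documents scores higher than one mixing unrelated words. In an evaluation like the one described, a score of this kind would be computed for each model and each value in `topic_counts`, and the results compared across LDA and NMF.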