Optimizing Local Satisfaction of Long-Run Average Objectives in Markov Decision Processes

Warning

This publication does not fall under the Institute of Computer Science but under the Faculty of Informatics. The official publication page is on the muni.cz website.

Authors	KLAŠKA David, KUČERA Antonín, KŮR Vojtěch, MUSIL Vít, ŘEHÁK Vojtěch
Year of publication	2024
Type	Article in Proceedings
Conference	Proceedings of 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)
MU Faculty or unit	Faculty of Informatics
Citation
www	Paper URL
Doi	https://doi.org/10.1609/aaai.v38i18.29993
Keywords	Markov decision processes; invariant distribution
Attached files	paper.pdf
Description	Long-run average optimization problems for Markov decision processes (MDPs) require constructing policies with optimal steady-state behavior, i.e., optimal limit frequency of visits to the states. However, such policies may suffer from local instability in the sense that the frequency of states visited in a bounded time horizon along a run differs significantly from the limit frequency. In this work, we propose an efficient algorithmic solution to this problem.
Related projects:	Models, Algorithms, and Tools for Solving Adversarial Security Problems; Cyber-security Excellence Hub in Estonia and South Moravia (CHESS); VESCAA: Verifiable and Efficient Synthesis of Controllers for Autonomous Agents
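
The local instability mentioned in the description can be illustrated with a small sketch. This is not the algorithm from the paper. It is a toy comparison, under assumed data, of two hypothetical two-state Markov chains (the behavior an MDP exhibits once a policy is fixed) that share the same limit frequency of visits but differ in how closely bounded windows of a run follow it. The chains `alternating` and `sticky`, the window length of 10, and all function names are assumptions made for illustration.

```python
import random


def stationary_distribution(P, iterations=10_000):
    """Approximate the invariant distribution of a row-stochastic matrix P.

    Averages the state distributions over many steps (a Cesaro average),
    which also behaves well for periodic chains.
    """
    n = len(P)
    dist = [1.0 / n] * n          # start from the uniform distribution
    total = [0.0] * n
    for _ in range(iterations):
        total = [t + d for t, d in zip(total, dist)]
        dist = [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]
    return [t / iterations for t in total]


def simulate(P, start, steps, rng):
    """Sample a run (list of visited states) of the given length."""
    state, run = start, []
    for _ in range(steps):
        run.append(state)
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return run


def worst_window_deviation(run, target, window):
    """Largest gap between a state's frequency in any window of the run
    and its target (limit) frequency, using a sliding window."""
    counts = [0] * len(target)
    for s in run[:window]:
        counts[s] += 1
    worst = max(abs(c / window - t) for c, t in zip(counts, target))
    for k in range(window, len(run)):
        counts[run[k]] += 1            # state entering the window
        counts[run[k - window]] -= 1   # state leaving the window
        worst = max(worst, max(abs(c / window - t)
                               for c, t in zip(counts, target)))
    return worst


# Hypothetical chains: both have limit frequency 1/2 for each state.
alternating = [[0.0, 1.0], [1.0, 0.0]]      # strictly switches states
sticky = [[0.95, 0.05], [0.05, 0.95]]       # tends to stay where it is

rng = random.Random(0)
for name, P in (("alternating", alternating), ("sticky", sticky)):
    pi = stationary_distribution(P)
    run = simulate(P, start=0, steps=5000, rng=rng)
    print(name, pi, worst_window_deviation(run, pi, window=10))
```

In this toy setting the alternating chain matches the limit frequency in every window of even length, while the sticky chain tends to spend long stretches in one state, so some windows deviate strongly even though the long-run averages coincide.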