Anytime Guarantees for Reachability in Uncountable Markov Decision Processes.

LANCASTER UNIVERSITY LEIPZIG

Text available via DOI:

https://doi.org/10.4230/LIPIcs.CONCUR.2022.11
Final published version
Available under license: CC BY: Creative Commons Attribution 4.0 International License

Keywords

Markov decision process, Uncountable system, anytime guarantee, discrete-time Markov control process, probabilistic verification

View graph of relations

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Published

Standard

Anytime Guarantees for Reachability in Uncountable Markov Decision Processes. / Grover, Kush; Kretínský, Jan; Meggendorfer, Tobias et al.
33rd International Conference on Concurrency Theory, CONCUR 2022. ed. / Bartek Klin; Slawomir Lasota; Anca Muscholl. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. p. 11:1-11:20 11 (Leibniz International Proceedings in Informatics, LIPIcs; Vol. 243).

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review

Harvard

Grover, K, Kretínský, J, Meggendorfer, T & Weininger, M 2022, Anytime Guarantees for Reachability in Uncountable Markov Decision Processes. in B Klin, S Lasota & A Muscholl (eds), 33rd International Conference on Concurrency Theory, CONCUR 2022., 11, Leibniz International Proceedings in Informatics, LIPIcs, vol. 243, Schloss Dagstuhl - Leibniz-Zentrum für Informatik, pp. 11:1-11:20. https://doi.org/10.4230/LIPIcs.CONCUR.2022.11

APA

Grover, K., Kretínský, J., Meggendorfer, T., & Weininger, M. (2022). Anytime Guarantees for Reachability in Uncountable Markov Decision Processes. In B. Klin, S. Lasota, & A. Muscholl (Eds.), 33rd International Conference on Concurrency Theory, CONCUR 2022 (pp. 11:1-11:20). Article 11 (Leibniz International Proceedings in Informatics, LIPIcs; Vol. 243). Schloss Dagstuhl - Leibniz-Zentrum für Informatik. https://doi.org/10.4230/LIPIcs.CONCUR.2022.11

Vancouver

Grover K, Kretínský J, Meggendorfer T, Weininger M. Anytime Guarantees for Reachability in Uncountable Markov Decision Processes. In Klin B, Lasota S, Muscholl A, editors, 33rd International Conference on Concurrency Theory, CONCUR 2022. Schloss Dagstuhl - Leibniz-Zentrum für Informatik. 2022. p. 11:1-11:20. 11. (Leibniz International Proceedings in Informatics, LIPIcs). doi: 10.4230/LIPIcs.CONCUR.2022.11

Author

Grover, Kush ; Kretínský, Jan ; Meggendorfer, Tobias et al. / Anytime Guarantees for Reachability in Uncountable Markov Decision Processes. 33rd International Conference on Concurrency Theory, CONCUR 2022. editor / Bartek Klin ; Slawomir Lasota ; Anca Muscholl. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. pp. 11:1-11:20 (Leibniz International Proceedings in Informatics, LIPIcs).

Bibtex

@inproceedings{ea97bf9c090748ff97436073fb67261c,

title = "Anytime Guarantees for Reachability in Uncountable Markov Decision Processes.",

abstract = "We consider the problem of approximating the reachability probabilities in Markov decision processes (MDP) with uncountable (continuous) state and action spaces. While there are algorithms that, for special classes of such MDP, provide a sequence of approximations converging to the true value in the limit, our aim is to obtain an algorithm with guarantees on the precision of the approximation. As this problem is undecidable in general, assumptions on the MDP are necessary. Our main contribution is to identify sufficient assumptions that are as weak as possible, thus approaching the “boundary” of which systems can be correctly and reliably analyzed. To this end, we also argue why each of our assumptions is necessary for algorithms based on processing finitely many observations. We present two solution variants. The first one provides converging lower bounds under weaker assumptions than typical ones from previous works concerned with guarantees. The second one then utilizes stronger assumptions to additionally provide converging upper bounds. Altogether, we obtain an anytime algorithm, i.e. yielding a sequence of approximants with known and iteratively improving precision, converging to the true value in the limit. Besides, due to the generality of our assumptions, our algorithms are very general templates, readily allowing for various heuristics from literature in contrast to, e.g., a specific discretization algorithm. Our theoretical contribution thus paves the way for future practical improvements without sacrificing correctness guarantees.",

keywords = "Markov decision process, Uncountable system, anytime guarantee, discrete-time Markov control process, probabilistic verification",

author = "Kush Grover and Jan Kret{\'i}nsk{\'y} and Tobias Meggendorfer and Maximilian Weininger",

year = "2022",

month = sep,

day = "6",

doi = "10.4230/LIPIcs.CONCUR.2022.11",

language = "English",

series = "Leibniz International Proceedings in Informatics, LIPIcs",

publisher = "Schloss Dagstuhl - Leibniz-Zentrum f{\"u}r Informatik",

pages = "11:1--11:20",

editor = "Bartek Klin and Slawomir Lasota and Anca Muscholl",

booktitle = "33rd International Conference on Concurrency Theory, CONCUR 2022",

}

RIS

TY - GEN

T1 - Anytime Guarantees for Reachability in Uncountable Markov Decision Processes.

AU - Grover, Kush

AU - Kretínský, Jan

AU - Meggendorfer, Tobias

AU - Weininger, Maximilian

PY - 2022/9/6

Y1 - 2022/9/6

N2 - We consider the problem of approximating the reachability probabilities in Markov decision processes (MDP) with uncountable (continuous) state and action spaces. While there are algorithms that, for special classes of such MDP, provide a sequence of approximations converging to the true value in the limit, our aim is to obtain an algorithm with guarantees on the precision of the approximation. As this problem is undecidable in general, assumptions on the MDP are necessary. Our main contribution is to identify sufficient assumptions that are as weak as possible, thus approaching the “boundary” of which systems can be correctly and reliably analyzed. To this end, we also argue why each of our assumptions is necessary for algorithms based on processing finitely many observations. We present two solution variants. The first one provides converging lower bounds under weaker assumptions than typical ones from previous works concerned with guarantees. The second one then utilizes stronger assumptions to additionally provide converging upper bounds. Altogether, we obtain an anytime algorithm, i.e. yielding a sequence of approximants with known and iteratively improving precision, converging to the true value in the limit. Besides, due to the generality of our assumptions, our algorithms are very general templates, readily allowing for various heuristics from literature in contrast to, e.g., a specific discretization algorithm. Our theoretical contribution thus paves the way for future practical improvements without sacrificing correctness guarantees.

AB - We consider the problem of approximating the reachability probabilities in Markov decision processes (MDP) with uncountable (continuous) state and action spaces. While there are algorithms that, for special classes of such MDP, provide a sequence of approximations converging to the true value in the limit, our aim is to obtain an algorithm with guarantees on the precision of the approximation. As this problem is undecidable in general, assumptions on the MDP are necessary. Our main contribution is to identify sufficient assumptions that are as weak as possible, thus approaching the “boundary” of which systems can be correctly and reliably analyzed. To this end, we also argue why each of our assumptions is necessary for algorithms based on processing finitely many observations. We present two solution variants. The first one provides converging lower bounds under weaker assumptions than typical ones from previous works concerned with guarantees. The second one then utilizes stronger assumptions to additionally provide converging upper bounds. Altogether, we obtain an anytime algorithm, i.e. yielding a sequence of approximants with known and iteratively improving precision, converging to the true value in the limit. Besides, due to the generality of our assumptions, our algorithms are very general templates, readily allowing for various heuristics from literature in contrast to, e.g., a specific discretization algorithm. Our theoretical contribution thus paves the way for future practical improvements without sacrificing correctness guarantees.

KW - Markov decision process

KW - Uncountable system

KW - anytime guarantee

KW - discrete-time Markov control process

KW - probabilistic verification

U2 - 10.4230/LIPIcs.CONCUR.2022.11

DO - 10.4230/LIPIcs.CONCUR.2022.11

M3 - Conference contribution/Paper

T3 - Leibniz International Proceedings in Informatics, LIPIcs

SP - 11:1-11:20

BT - 33rd International Conference on Concurrency Theory, CONCUR 2022

A2 - Klin, Bartek

A2 - Lasota, Slawomir

A2 - Muscholl, Anca

PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik

ER -

Research

Links

Text available via DOI:

Keywords