Efficient Strategy Synthesis for MDPs with Resource Constraints
by
František Blahoudek, Petr Novotný, Melkior Ornik, Pranay Thangeda, Ufuk Topcu
2021
Abstract
We consider qualitative strategy synthesis for the formalism called
consumption Markov decision processes. This formalism can model the dynamics of
an agent that operates under resource constraints in a stochastic environment.
The presented algorithms work in time polynomial with respect to the
representation of the model and they synthesize strategies ensuring that a
given set of goal states will be reached (once or infinitely many times) with
probability 1 without resource exhaustion. In particular, when the amount of
resource becomes too low to safely continue the mission, the strategy
changes the course of the agent towards one of a designated set of reload
states where the agent replenishes the resource to full capacity; with a
sufficient amount of resource, the agent attempts to fulfill the mission again.
We also present two heuristics that attempt to reduce the expected time that
the agent needs to fulfill the given mission, a parameter important in practical
planning. The presented algorithms were implemented, and numerical examples
demonstrate (i) the effectiveness (in terms of computation time) of the
planning approach based on consumption Markov decision processes and (ii) the
positive impact of the two heuristics on planning in a realistic example.
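
The reload behavior described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the toy model, the state and action names, and the functions `min_level_to_reload` and `safe_actions` are all hypothetical, and the sketch ignores the capacity bound and the goal objective. It only computes, for each state, the least resource level that guarantees reaching a reload state whatever the stochastic outcome, and then lists the actions that preserve this guarantee.

```python
from math import inf

# Hypothetical toy model (not from the paper):
# state -> {action: (resource consumption, set of possible successors)}.
# Only the support of each action matters for this qualitative analysis.
ACTIONS = {
    "base":  {"go": (2, {"road"})},
    "road":  {"on": (3, {"goal", "ditch"}), "back": (2, {"base"})},
    "ditch": {"climb": (4, {"road"})},
    "goal":  {"home": (5, {"base"})},
}
RELOADS = {"base"}  # states where the resource is replenished


def worst_case_cost(cost, succ, reloads, value):
    """Consumption of one action plus the worst requirement among successors."""
    return cost + max(0 if t in reloads else value[t] for t in succ)


def min_level_to_reload(actions, reloads):
    """Least level per state that surely reaches a reload state in >= 1 step."""
    value = {s: inf for s in actions}
    # Iterate from above; one round per state suffices for this min-max
    # recurrence when all consumptions are non-negative.
    for _ in range(len(actions)):
        for s, acts in actions.items():
            value[s] = min(
                worst_case_cost(cost, succ, reloads, value)
                for cost, succ in acts.values()
            )
    return value


def safe_actions(state, level, actions, reloads, value):
    """Actions after which a reload state is still surely reachable."""
    return sorted(
        a for a, (cost, succ) in actions[state].items()
        if level >= worst_case_cost(cost, succ, reloads, value)
    )


value = min_level_to_reload(ACTIONS, RELOADS)
print(value)
print(safe_actions("road", 10, ACTIONS, RELOADS, value))
print(safe_actions("road", 5, ACTIONS, RELOADS, value))
```

In this toy model, an agent on `road` with a high resource level may keep heading for `goal`, while with a low level only the action leading back to the reload state remains safe, which mirrors the change of course described above.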
arXiv: 2105.02099v1