Sciweavers

ATAL
2015
Springer

Incremental Policy Iteration with Guaranteed Escape from Local Optima in POMDP Planning

Partially observable Markov decision processes (POMDPs) provide a natural framework to design applications that continuously make decisions based on noisy sensor measurements. The recent proliferation of smartphones and other wearable devices leads to new applications where, unfortunately, energy efficiency becomes an issue. To circumvent energy requirements, finite-state controllers can be applied because they are computationally inexpensive to execute. Additionally, when multi-agent POMDPs (e.g. Dec-POMDPs or I-POMDPs) are taken into account, finite-state controllers become one of the most important policy representations. Online methods scale the best; however, they are energy demanding. Thus, methods to optimize finite-state controllers are necessary. In this paper, we present a new, efficient approach to bounded policy iteration (BPI). BPI keeps the size of the controller small, which is a desirable property for applications, especially on small devices. However, finding a...
Marek Grzes, Pascal Poupart
Added: 16 Apr 2016
Updated: 16 Apr 2016
Type: Journal
Year: 2015
Where: ATAL
Authors: Marek Grzes, Pascal Poupart