In conservation biology and natural resource management, adaptive management is an iterative process of improving management by reducing uncertainty via monitoring. Adaptive manag...
Iadine Chades, Josie Carwardine, Tara G. Martin, S...
Motivated by a machine learning perspective, that game-theoretic equilibria constraints should serve as guidelines for predicting agents’ strategies, we introduce maximum causal...
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...
We consider the problem of several users transmitting packets to a base station, and study an optimal scheduling formulation involving three communication layers, namely, the mediu...
Stochastic games generalize Markov decision processes (MDPs) to a multiagent setting by allowing the state transitions to depend jointly on all player actions, and having rewards de...
Michael J. Kearns, Yishay Mansour, Satinder P. Sin...
We present a novel POMDP planning algorithm called heuristic search value iteration (HSVI). HSVI is an anytime algorithm that returns a policy and a provable bound on its regret w...
This paper introduces the Point-Based Value Iteration (PBVI) algorithm for POMDP planning. PBVI approximates an exact value iteration solution by selecting a small set of represen...
Joelle Pineau, Geoffrey J. Gordon, Sebastian Thrun
The POMDP is considered a powerful model for planning under uncertainty. However, it is usually impractical to employ a POMDP with exact parameters to model precisely the real-...