Sciweavers

102 search results - page 5 / 21
» MDPs with Non-Deterministic Policies
NIPS
2000
APRICODD: Approximate Policy Construction Using Decision Diagrams
We propose a method of approximate dynamic programming for Markov decision processes (MDPs) using algebraic decision diagrams (ADDs). We produce near-optimal value functions and p...
Robert St-Aubin, Jesse Hoey, Craig Boutilier
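Not from the paper itself — a minimal sketch of the exact tabular value iteration that APRICODD approximates; the paper's contribution is representing the value function as an algebraic decision diagram and merging nearly-equal leaves for compactness, which is not reproduced here. All names and the flat-table representation are my own.

```python
def value_iteration(P, R, gamma=0.9, eps=1e-6):
    """Exact tabular value iteration for a finite MDP.
    P[a][s][t] = transition probability, R[s] = reward, gamma = discount.
    APRICODD replaces the flat table V with an ADD whose leaves are
    merged when nearly equal, trading precision for size."""
    n = len(R)
    V = [0.0] * n
    while True:
        V_new = [R[s] + gamma * max(sum(P[a][s][t] * V[t] for t in range(n))
                                    for a in range(len(P)))
                 for s in range(n)]
        if max(abs(V_new[s] - V[s]) for s in range(n)) < eps:
            return V_new
        V = V_new
```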
ORL
2007
NP-Hardness of checking the unichain condition in average cost MDPs
The unichain condition requires that every policy in an MDP result in a single ergodic class, and guarantees that the optimal average cost is independent of the initial state. We ...
John N. Tsitsiklis
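Not from the paper — a sketch illustrating what the unichain condition asks for. Tsitsiklis's hardness result concerns checking it over *all* policies of an MDP; for one fixed policy the check is easy: the induced Markov chain must have exactly one recurrent (closed communicating) class. Function names are mine.

```python
def reachable(P, s):
    """States reachable from s in the chain with transition matrix P."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v, p in enumerate(P[u]):
            if p > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_unichain_policy(P):
    """True iff the Markov chain P (induced by one fixed policy) has a
    single recurrent class.  P[s][t] = transition probability."""
    n = len(P)
    reach = [reachable(P, s) for s in range(n)]
    # s is recurrent iff every state reachable from s can reach s back
    recurrent = [s for s in range(n) if all(s in reach[t] for t in reach[s])]
    # a recurrent state's reachable set is exactly its (closed) class
    return len({frozenset(reach[s]) for s in recurrent}) == 1
```

The NP-hardness arises because an MDP has exponentially many policies, and the paper shows one cannot efficiently certify that every one of them yields such a single-class chain.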
ATAL
2006
Springer
On the relationship between MDPs and the BDI architecture
In this paper we describe the initial results of an investigation into the relationship between Markov Decision Processes (MDPs) and Belief-Desire-Intention (BDI) architectures. W...
Gerardo I. Simari, Simon Parsons
ICML
2010
IEEE
Inverse Optimal Control with Linearly-Solvable MDPs
We present new algorithms for inverse optimal control (or inverse reinforcement learning, IRL) within the framework of linearly-solvable MDPs (LMDPs). Unlike most prior IRL algorit...
Krishnamurthy Dvijotham, Emanuel Todorov
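Not from the paper — a minimal sketch of the forward machinery the abstract builds on (Todorov's first-exit LMDP): the desirability function z = exp(-v) satisfies a linear fixed-point equation z(x) = exp(-q(x)) Σ p(x'|x) z(x') at interior states, solvable by power iteration. The paper's IRL algorithms invert this; they are not reproduced here, and all names below are mine.

```python
import math

def lmdp_cost_to_go(P, q, terminal, iters=2000):
    """Power iteration for the desirability z of a first-exit LMDP.
    P[x][y] = passive dynamics, q[x] = state cost, terminal = set of
    absorbing states.  Returns the optimal cost-to-go v = -log z."""
    n = len(q)
    z = [1.0] * n
    for _ in range(iters):
        z = [math.exp(-q[x]) if x in terminal
             else math.exp(-q[x]) * sum(P[x][y] * z[y] for y in range(n))
             for x in range(n)]
    return [-math.log(zx) for zx in z]
```

The "linearly solvable" structure is exactly this: the Bellman equation becomes linear in z, so the forward problem reduces to an eigenvector/linear-system computation rather than a nonlinear max over actions.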
AAMAS
2010
Springer
Coordinated learning in multiagent MDPs with infinite state-space
In this paper we address the problem of simultaneous learning and coordination in multiagent Markov decision problems (MMDPs) with infinite state-spaces. We separate this ...
Francisco S. Melo, M. Isabel Ribeiro