Sciweavers

771 search results - page 63 / 155
» Markov Decision Processes with Arbitrary Reward Processes
CORR 2010 (Springer)
Optimism in Reinforcement Learning Based on Kullback-Leibler Divergence
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focusing on so-called optimistic strategies. Optimism is usually implemented by carryin...
Sarah Filippi, Olivier Cappé, Aurelien Gari...
PERCOM 2007 (ACM)
Sensor Scheduling for Optimal Observability Using Estimation Entropy
We consider sensor scheduling as the optimal observability problem for partially observable Markov decision processes (POMDPs). This model fits cases where a Markov process ...
Mohammad Rezaeian
ICML 2005 (IEEE)
A causal approach to hierarchical decomposition of factored MDPs
We present Variable Influence Structure Analysis, an algorithm that dynamically performs hierarchical decomposition of factored Markov decision processes. Our algorithm determines...
Anders Jonsson, Andrew G. Barto
NIPS 2004
VDCBPI: an Approximate Scalable Algorithm for Large POMDPs
Existing algorithms for discrete partially observable Markov decision processes can, at best, solve problems of a few thousand states due to two important sources of intractability:...
Pascal Poupart, Craig Boutilier
ICANN 2001 (Springer)
Market-Based Reinforcement Learning in Partially Observable Worlds
Unlike traditional reinforcement learning (RL), market-based RL is in principle applicable to worlds described by partially observable Markov Decision Processes (POMDPs), where an ...
Ivo Kwee, Marcus Hutter, Jürgen Schmidhuber