POMDPs are the models of choice for reinforcement learning (RL) tasks where the environment cannot be observed directly. In many applications, however, the POMDP structure and parameters must be learned from experience, and this is known to be a difficult problem. In this paper we address this issue by modeling the hidden environment with a novel class of models that are less expressive, but easier to learn and plan with, than POMDPs. We call these models deterministic Markov models (DMMs): deterministic-probabilistic finite automata from learning theory, extended with actions from the i.i.d. to the sequential setting. Conceptually, our approach extends the Utile Suffix Memory method of McCallum to handle long-term memory. We describe DMMs, give Bayesian algorithms for learning and planning with them, and present experimental results on standard POMDP tasks as well as on tasks designed to illustrate the efficacy of our approach.
M. M. Hassan Mahmud