Scaling Model-Based Average-Reward Reinforcement Learning for Product Delivery

Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high stochasticity. We present approaches that mitigate each of these curses. To handle the state-space explosion, we introduce "tabular linear functions" that generalize tile-coding and linear value functions. Action space complexity is reduced by replacing complete joint action space search with a form of hill climbing. To deal with high stochasticity, we introduce a new algorithm called ASH-learning, which is an afterstate version of H-Learning. Our extensions make it practical to apply reinforcement learning to a domain of product delivery - an optimization problem that combines inventory control and vehicle routing.
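
As an illustration of the action-space idea in the abstract, replacing an exhaustive search over joint actions with hill climbing, here is a minimal sketch. It is not the paper's implementation: the function name `hill_climb_joint_action`, the one-agent-at-a-time neighbourhood, and the generic `value` callback are all assumptions made for this example. The sketch starts from some joint action and repeatedly changes a single agent's component whenever doing so strictly improves the estimated value, stopping at a local optimum.

```python
from typing import Callable, Sequence, Tuple

JointAction = Tuple[int, ...]


def hill_climb_joint_action(
    choices: Sequence[Sequence[int]],
    value: Callable[[JointAction], float],
    start: JointAction,
) -> JointAction:
    """Coordinate-wise hill climbing over a joint action (illustrative only).

    choices[i] lists the actions available to agent i (hypothetical encoding).
    value scores a complete joint action; higher is better.
    """
    current = tuple(start)
    current_value = value(current)
    improved = True
    while improved:
        improved = False
        for agent, options in enumerate(choices):
            for option in options:
                if option == current[agent]:
                    continue
                # Change only this agent's component of the joint action.
                candidate = current[:agent] + (option,) + current[agent + 1:]
                candidate_value = value(candidate)
                # Accept strict improvements only, so the loop must terminate
                # on a finite action set.
                if candidate_value > current_value:
                    current, current_value = candidate, candidate_value
                    improved = True
    return current
```

Each pass evaluates at most the sum of the per-agent action counts, rather than their product as a full joint search would. The price is that the result is only a local optimum with respect to single-agent changes.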

Scott Proper, Prasad Tadepalli

Action Space | ECML 2006 | Machine Learning | Real-world Domains | Reinforcement Learning

Post Info

Added	22 Aug 2010
Updated	22 Aug 2010
Type	Conference
Year	2006
Where	ECML
Authors	Scott Proper, Prasad Tadepalli
