Sciweavers

85 search results - page 14 / 17
» Approximate Policy Iteration with a Policy Language Bias
Sort
View
ICML
2010
IEEE
13 years 8 months ago
Convergence of Least Squares Temporal Difference Methods Under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...
Huizhen Yu
SC
1995
ACM
13 years 11 months ago
Parallel Matrix-Vector Product Using Approximate Hierarchical Methods
Matrix-vector products (mat-vecs) form the core of iterative methods used for solving dense linear systems. Often, these systems arise in the solution of integral equations used i...
Ananth Grama, Vipin Kumar, Ahmed H. Sameh
ICRA
2008
IEEE
167views Robotics» more  ICRA 2008»
14 years 1 months ago
An approximate algorithm for solving oracular POMDPs
Abstract— We propose a new approximate algorithm, LAJIV (Lookahead J-MDP Information Value), to solve Oracular Partially Observable Markov Decision Problems (OPOMDPs), a special ...
Nicholas Armstrong-Crews, Manuela M. Veloso
ECML
2004
Springer
14 years 22 days ago
Filtered Reinforcement Learning
Reinforcement learning (RL) algorithms attempt to assign the credit for rewards to the actions that contributed to the reward. Thus far, credit assignment has been done in one of t...
Douglas Aberdeen
ATAL
2006
Springer
13 years 11 months ago
Decentralized planning under uncertainty for teams of communicating agents
Decentralized partially observable Markov decision processes (DEC-POMDPs) form a general framework for planning for groups of cooperating agents that inhabit a stochastic and part...
Matthijs T. J. Spaan, Geoffrey J. Gordon, Nikos A....