Sciweavers

995 search results - page 84 / 199
» Learning Useful Horn Approximations
ML
2008
ACM
13 years 10 months ago
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
Abstract. We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian Decision Problems. As opposed to previous theoretical wo...
András Antos, Csaba Szepesvári, R...
CORR
2012
Springer
12 years 5 months ago
PAC-Bayesian Policy Evaluation for Reinforcement Learning
Bayesian priors offer a compact yet general means of incorporating domain knowledge into many learning tasks. The correctness of the Bayesian analysis and inference, however, lar...
Mahdi Milani Fard, Joelle Pineau, Csaba Szepesv...
ML
2002
ACM
13 years 9 months ago
On Average Versus Discounted Reward Temporal-Difference Learning
We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...
John N. Tsitsiklis, Benjamin Van Roy
IWLCS
2005
Springer
14 years 3 months ago
Counter Example for Q-Bucket-Brigade Under Prediction Problem
Aiming to clarify the convergence or divergence conditions for Learning Classifier System (LCS), this paper explores: (1) an extreme condition where the reinforcement process of ...
Atsushi Wada, Keiki Takadama, Katsunori Shimohara
NN
2008
Springer
13 years 10 months ago
Multilayer in-place learning networks for modeling functional layers in the laminar cortex
Currently, there is a lack of general-purpose in-place learning networks that model feature layers in the cortex. By "general-purpose" we mean a general yet adaptive hig...
Juyang Weng, Tianyu Luwang, Hong Lu, Xiangyang Xue