Sciweavers

86 search results - page 6 / 18
» Estimation and Approximation Bounds for Gradient-Based Reinf...
Sort
View
JMLR
2010
125views more  JMLR 2010»
13 years 2 months ago
Maximum Likelihood in Cost-Sensitive Learning: Model Specification, Approximations, and Upper Bounds
The presence of asymmetry in the misclassification costs or class prevalences is a common occurrence in the pattern classification domain. While much interest has been devoted to ...
Jacek P. Dmochowski, Paul Sajda, Lucas C. Parra
ML
2008
ACM
152views Machine Learning» more  ML 2008»
13 years 7 months ago
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
Abstract. We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian Decision Problems. As opposed to previous theoretical wo...
András Antos, Csaba Szepesvári, R&ea...
NIPS
2000
13 years 9 months ago
Using Free Energies to Represent Q-values in a Multiagent Reinforcement Learning Task
The problem of reinforcement learning in large factored Markov decision processes is explored. The Q-value of a state-action pair is approximated by the free energy of a product o...
Brian Sallans, Geoffrey E. Hinton
NIPS
2007
13 years 9 months ago
Incremental Natural Actor-Critic Algorithms
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
ICML
2002
IEEE
14 years 8 months ago
Learning from Scarce Experience
Searching the space of policies directly for the optimal policy has been one popular method for solving partially observable reinforcement learning problems. Typically, with each ...
Leonid Peshkin, Christian R. Shelton