Search Sciweavers | Sciweavers

86 search results - page 6 / 18

» Estimation and Approximation Bounds for Gradient-Based Reinf...

209

click to vote

JMLR
2010

125views more JMLR 2010»

Maximum Likelihood in Cost-Sensitive Learning: Model Specification, Approximations, and Upper Bounds

15 years 1 months ago

Download bme.ccny.cuny.edu

The presence of asymmetry in the misclassification costs or class prevalences is a common occurrence in the pattern classification domain. While much interest has been devoted to ...

Jacek P. Dmochowski, Paul Sajda, Lucas C. Parra

claim paper

Read More »

227

click to vote

ML
2008
ACM

152views Machine Learning» more ML 2008»

Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path

15 years 7 months ago

Download hal.inria.fr

Abstract. We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian Decision Problems. As opposed to previous theoretical wo...

András Antos, Csaba Szepesvári, R&ea...

claim paper

Read More »

183

click to vote

NIPS
2000

127views Information Technology» more NIPS 2000»

Using Free Energies to Represent Q-values in a Multiagent Reinforcement Learning Task

15 years 8 months ago

Download members.chello.at

The problem of reinforcement learning in large factored Markov decision processes is explored. The Q-value of a state-action pair is approximated by the free energy of a product o...

Brian Sallans, Geoffrey E. Hinton

claim paper

Read More »

190

click to vote

NIPS
2007

164views Information Technology» more NIPS 2007»

Incremental Natural Actor-Critic Algorithms

15 years 8 months ago

Download books.nips.cc

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...

claim paper

Read More »

185

click to vote

ICML
2002
IEEE

113views Machine Learning» more ICML 2002»

Learning from Scarce Experience

16 years 7 months ago

Download www.cs.ucr.edu

Searching the space of policies directly for the optimal policy has been one popular method for solving partially observable reinforcement learning problems. Typically, with each ...

Leonid Peshkin, Christian R. Shelton

claim paper

Read More »

« Prev « First page 6 / 18 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers