Search Sciweavers | Sciweavers

181 search results - page 8 / 37

» On Policy Learning in Restricted Policy Spaces

ICMLA
2010

203 views, Machine Learning

Multimodal Parameter-exploring Policy Gradients

13 years 5 months ago

Download www6.in.tum.de

Abstract: Policy Gradients with Parameter-based Exploration (PGPE) is a novel model-free reinforcement learning method that alleviates the problem of high-variance gradient estima...

Frank Sehnke, Alex Graves, Christian Osendorfer, J...

Publication

222 views

Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration

14 years 4 months ago

Download arxiv.org

Abstract: Several approximate policy iteration schemes without value functions, which focus on policy representation using classifiers and address policy learning as a supervis...

Christos Dimitrakakis, Michail G. Lagoudakis

posted by olethros

ATAL
2007
Springer

146 views, Intelligent Agents

Transfer via inter-task mappings in policy search reinforcement learning

14 years 1 month ago

Download userweb.cs.utexas.edu

The ambitious goal of transfer learning is to accelerate learning on a target task after training on a different, but related, source task. While many past transfer methods have f...

Matthew E. Taylor, Shimon Whiteson, Peter Stone

GLOBECOM
2006
IEEE

160 views, Communications

Adaptive Learning of Transmission Control Policies for MIMO Fading Channels under Delay Constraint

14 years 1 month ago

Download www.ece.ubc.ca

This paper addresses learning based adaptive resource allocation for wireless MIMO channels with Markovian fading. The problem is posed as Constrained Markov Decision Process w...

Dejan V. Djonin, Vikram Krishnamurthy

PKDD
2010
Springer

164 views, Data Mining

Efficient Planning in Large POMDPs through Policy Graph Based Factorized Approximations

13 years 5 months ago

Download users.ics.tkk.fi

Partially observable Markov decision processes (POMDPs) are widely used for planning under uncertainty. In many applications, the huge size of the POMDP state space makes straightf...

Joni Pajarinen, Jaakko Peltonen, Ari Hottinen, Mik...
