Search Sciweavers | Sciweavers

81 search results - page 14 / 17

» The Optimal Reward Baseline for Gradient-Based Reinforcement...


JMLR
2010


A Convergent Online Single Time Scale Actor Critic Algorithm

15 years 2 months ago

Download jmlr.csail.mit.edu

Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...

Dotan Di Castro, Ron Meir


PKDD
2010
Springer


Efficient Planning in Large POMDPs through Policy Graph Based Factorized Approximations

15 years 5 months ago

Download users.ics.tkk.fi

Partially observable Markov decision processes (POMDPs) are widely used for planning under uncertainty. In many applications, the huge size of the POMDP state space makes straightf...

Joni Pajarinen, Jaakko Peltonen, Ari Hottinen, Mik...


ICAC
2005
IEEE


Self-Optimizing Architecture for QoS Provisioning in Differentiated Services

16 years 1 month ago

Download csdl2.computer.org

This paper presents a scalable and self-optimizing architecture for Quality-of-Service (QoS) provisioning in the Differentiated Services (DiffServ) framework. The proposed archite...

Daniel Yagan, Chen-Khong Tham


ML
2002
ACM


Finite-time Analysis of the Multiarmed Bandit Problem

15 years 7 months ago

Download homes.dsi.unimi.it

Reinforcement learning policies face the exploration versus exploitation dilemma, i.e. the search for a balance between exploring the environment to find profitable actions while t...

Peter Auer, Nicolò Cesa-Bianchi, Paul Fisch...


SASO
2009
IEEE


Distributed W-Learning: Multi-Policy Optimization in Self-Organizing Systems

16 years 2 months ago

Download www.scss.tcd.ie

Large-scale agent-based systems are required to self-optimize towards multiple, potentially conflicting, policies of varying spatial and temporal scope. As a result, not all ag...

Ivana Dusparic, Vinny Cahill
