Search Sciweavers | Sciweavers

1233 search results - page 183 / 247

» Feudal Reinforcement Learning

CDC
2009
IEEE

160 views, Control Systems

Exploring and exploiting routing opportunities in wireless ad-hoc networks

15 years 3 months ago

Download circuit.ucsd.edu

Abstract: In this paper, d-AdaptOR, a distributed opportunistic routing scheme for multi-hop wireless ad-hoc networks, is proposed. The proposed scheme utilizes a reinforcement lear...

Abhijeet Bhorkar, Mohammad Naghshvar, Tara Javidi,...

PE
2011
Springer

215 views, Optimization

Energy-aware routing in the Cognitive Packet Network

15 years 24 days ago

Download san.ee.ic.ac.uk

An energy-aware routing protocol (EARP) is proposed to minimise a performance metric that combines the total consumed power in the network and the QoS that is specified for the ...

Toktam Mahmoodi

CDC
2010
IEEE

160 views, Control Systems

Adaptive bases for Q-learning

15 years 23 days ago

Download webee.technion.ac.il

Abstract: We consider reinforcement learning, and in particular the Q-learning algorithm, in large state and action spaces. In order to cope with the size of the spaces, a functio...

Dotan Di Castro, Shie Mannor

GECCO
2010
Springer

153 views, Optimization

Multi-task evolutionary shaping without pre-specified representations

15 years 9 months ago

Download www.science.uva.nl

Shaping functions can be used in multi-task reinforcement learning (RL) to incorporate knowledge from previously experienced tasks to speed up learning on a new task. So far, rese...

Matthijs Snel, Shimon Whiteson

ATAL
2008
Springer

123 views, Intelligent Agents

Sigma point policy iteration

15 years 7 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...
