Sciweavers

» Feudal Reinforcement Learning
CDC
2009
IEEE
Control Systems
Exploring and exploiting routing opportunities in wireless ad-hoc networks
In this paper, d-AdaptOR, a distributed opportunistic routing scheme for multi-hop wireless ad-hoc networks, is proposed. The proposed scheme utilizes a reinforcement lear...
Abhijeet Bhorkar, Mohammad Naghshvar, Tara Javidi,...
PE
2011
Springer
Optimization
Energy-aware routing in the Cognitive Packet Network
An energy-aware routing protocol (EARP) is proposed to minimise a performance metric that combines the total consumed power in the network and the QoS that is specified for the ...
Toktam Mahmoodi
CDC
2010
IEEE
Control Systems
Adaptive bases for Q-learning
We consider reinforcement learning, and in particular the Q-learning algorithm, in large state and action spaces. In order to cope with the size of the spaces, a functio...
Dotan Di Castro, Shie Mannor
GECCO
2010
Springer
Optimization
Multi-task evolutionary shaping without pre-specified representations
Shaping functions can be used in multi-task reinforcement learning (RL) to incorporate knowledge from previously experienced tasks to speed up learning on a new task. So far, rese...
Matthijs Snel, Shimon Whiteson
ATAL
2008
Springer
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...