Sciweavers

983 search results - page 16 / 197
» A Better Update Policy
NN
2010
Springer
Neural Networks
13 years 2 months ago
Efficient exploration through active learning for value function approximation in reinforcement learning
Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares ...
Takayuki Akiyama, Hirotaka Hachiya, Masashi Sugiya...
AAAI
2011
12 years 7 months ago
Differential Eligibility Vectors for Advantage Updating and Gradient Methods
In this paper we propose differential eligibility vectors (DEV) for temporal-difference (TD) learning, a new class of eligibility vectors designed to bring out the contribution of...
Francisco S. Melo
IWAN
2000
Springer
13 years 11 months ago
Two Rule-Based Building-Block Architectures for Policy-Based Network Control
Policy-based networks can be customized by users by injecting programs called policies into the network nodes. So if general-purpose functions can be specified in a policy-based ne...
Yasusi Kanada
ICML
2009
IEEE
14 years 8 months ago
Predictive representations for policy gradient in POMDPs
We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...
Abdeslam Boularias, Brahim Chaib-draa
APSEC
2004
IEEE
13 years 11 months ago
Partitioning of Java Applications to Support Dynamic Updates
The requirement for 24/7 availability of distributed applications complicates their maintenance and evolution as shutting down such applications to perform updates may not be an a...
Robert Pawel Bialek, Eric Jul, Jean-Guy Schneider,...