Sciweavers

983 search results - page 159 / 197
» A Better Update Policy
NIPS
2007
13 years 10 months ago
Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Ambuj Tewari, Peter L. Bartlett
UAI
2008
13 years 10 months ago
Hierarchical POMDP Controller Optimization by Likelihood Maximization
Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. [4] recently showed that the hierarchy discovery problem can be...
Marc Toussaint, Laurent Charlin, Pascal Poupart
SEC
2007
13 years 10 months ago
Building a Distributed Semantic-aware Security Architecture
Enhancing the service-oriented architecture paradigm with semantic components is a new field of research and goal of many ongoing projects. The results lead to more powerful web a...
Jan Kolter, Rolf Schillinger, Günther Pernul
AAAI
2006
13 years 10 months ago
Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning
Reinforcement learning problems are commonly tackled with temporal difference methods, which attempt to estimate the agent's optimal value function. In most real-world proble...
Shimon Whiteson, Peter Stone
ISCAPDCS
2004
13 years 10 months ago
An Adaptive OpenMP Loop Scheduler for Hyperthreaded SMPs
Hyperthreaded (HT) and simultaneous multithreaded (SMT) processors are now available in commodity workstations and servers. This technology is designed to increase throughput by ex...
Yun Zhang, Mihai Burcea, Victor Cheng, Ron Ho, Mic...