Sciweavers

238 search results - page 44 / 48
» Value-Function Approximations for Partially Observable Marko...
AIPS
2008
15 years 5 months ago
HiPPo: Hierarchical POMDPs for Planning Information Processing and Sensing Actions on a Robot
Flexible general purpose robots need to tailor their visual processing to their task, on the fly. We propose a new approach to this within a planning framework, where the goal is ...
Mohan Sridharan, Jeremy L. Wyatt, Richard Dearden
TR
2010
14 years 10 months ago
Optimal Maintenance Strategies for Wind Turbine Systems Under Stochastic Weather Conditions
We examine optimal repair strategies for wind turbines operated under stochastic weather conditions. In-situ sensors installed at wind turbines produce useful information...
Eunshin Byon, Lewis Ntaimo, Yu Ding
ATAL
2003
Springer
15 years 8 months ago
Optimizing information exchange in cooperative multi-agent systems
Decentralized control of a cooperative multi-agent system is the problem faced by multiple decision-makers that share a common set of objectives. The decision-makers may be robots...
Claudia V. Goldman, Shlomo Zilberstein
GECCO
2009
Springer
15 years 1 month ago
Uncertainty handling CMA-ES for reinforcement learning
The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Verena Heidrich-Meisner, Christian Igel
ECML
2007
Springer
15 years 9 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber