Sciweavers

147 search results - page 20 / 30
» Policy Gradient in Continuous Time
ML
2006
ACM
13 years 9 months ago
Universal parameter optimisation in games based on SPSA
Most game programs have a large number of parameters that are crucial for their performance. While tuning these parameters by hand is rather difficult, efficient and easy to use ge...
Levente Kocsis, Csaba Szepesvári
ICASSP
2009
IEEE
14 years 4 months ago
Extended VTS for noise-robust speech recognition
Model compensation is a standard way of improving the robustness of speech recognition systems to noise. A number of popular schemes are based on vector Taylor series (VTS) compen...
Rogier C. van Dalen, Mark J. F. Gales
CVPR
2010
IEEE
14 years 1 months ago
Discontinuous Seam-Carving for Video Retargeting
We introduce a new algorithm for video retargeting that uses discontinuous seam-carving in both space and time for resizing videos. Our algorithm relies on a novel appearance-base...
Matthias Grundmann, Vivek Kwatra, Mei Han, Irfan E...
ICRA
2006
IEEE
14 years 3 months ago
Quadruped Robot Obstacle Negotiation via Reinforcement Learning
Legged robots can, in principle, traverse a large variety of obstacles and terrains. In this paper, we describe a successful application of reinforcement learning to the proble...
Honglak Lee, Yirong Shen, Chih-Han Yu, Gurjeet Sin...
NIPS
2003
13 years 11 months ago
Gaussian Processes in Reinforcement Learning
We exploit some useful properties of Gaussian process (GP) regression models for reinforcement learning in continuous state spaces and discrete time. We demonstrate how the GP mod...
Carl Edward Rasmussen, Malte Kuss