Sciweavers

51 search results - page 8 / 11
» Exponentiated Gradient Methods for Reinforcement Learning
ESANN
2007
13 years 9 months ago
Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning
In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates which are achieved using natur...
Jan Peters, Stefan Schaal
JMLR
2006
13 years 7 months ago
Policy Gradient in Continuous Time
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
Rémi Munos
ICML
2001
IEEE
14 years 8 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
ANOR
2005
13 years 7 months ago
Entropic Penalties in Finite Games
The main objects here are finite-strategy games in which entropic terms are subtracted from the payoffs. After such subtraction each Nash equilibrium solves an explicit, unconstra...
Sjur Didrik Flåm, E. Cavazzuti
ECIR
2010
Springer
13 years 5 months ago
Maximum Margin Ranking Algorithms for Information Retrieval
Machine learning ranking methods are increasingly applied to ranking tasks in information retrieval (IR). However, ranking tasks in IR often differ from standard ranking t...
Shivani Agarwal, Michael Collins