Sciweavers

313 search results - page 11 / 63
» Consistent Approximations and Approximate Functions and Grad...
JCP
2007
13 years 8 months ago
Noisy K Best-Paths for Approximate Dynamic Programming with Application to Portfolio Optimization
We describe a general method to transform a non-Markovian sequential decision problem into a supervised learning problem using a K-best-paths algorithm. We consider an a...
Nicolas Chapados, Yoshua Bengio
SAC
2009
ACM
14 years 3 months ago
A gradient oriented recombination scheme for evolution strategies
This paper proposes a novel recombination scheme for evolutionary algorithms, which can guide the new population generation towards the maximum increase of the objective function....
Haifeng Chen, Guofei Jiang
AAAI
2008
13 years 11 months ago
Adaptive Importance Sampling with Automatic Model Selection in Value Function Approximation
Off-policy reinforcement learning is aimed at efficiently reusing data samples gathered in the past, which is an essential problem for physically grounded AI as experiments are us...
Hirotaka Hachiya, Takayuki Akiyama, Masashi Sugiya...
UAI
2008
13 years 10 months ago
Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping
We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...
Richard S. Sutton, Csaba Szepesvári, Alborz...
COCO
1994
Springer
14 years 1 months ago
Random Debaters and the Hardness of Approximating Stochastic Functions
A probabilistically checkable debate system (PCDS) for a language L consists of a probabilistic polynomial-time verifier V and a debate between Player 1, who claims that the input x ...
Anne Condon, Joan Feigenbaum, Carsten Lund, Peter ...