Search Sciweavers | Sciweavers

86 search results - page 7 / 18

» Estimation and Approximation Bounds for Gradient-Based Reinf...

191

click to vote

JUCS
2007

98views more JUCS 2007»

Focus of Attention in Reinforcement Learning

15 years 6 months ago

Download www.research.rutgers.edu

Abstract: Classiﬁcation-based reinforcement learning (RL) methods have recently been proposed as an alternative to the traditional value-function based methods. These methods use...

Lihong Li, Vadim Bulitko, Russell Greiner

claim paper

Read More »

171

click to vote

AIPS
2008

95views Artificial Intelligence» more AIPS 2008»

Learning Heuristic Functions through Approximate Linear Programming

15 years 9 months ago

Download anytime.cs.umass.edu

Planning problems are often formulated as heuristic search. The choice of the heuristic function plays a significant role in the performance of planning systems, but a good heuris...

Marek Petrik, Shlomo Zilberstein

claim paper

Read More »

194

click to vote

NIPS
1996

112views Information Technology» more NIPS 1996»

Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning

15 years 8 months ago

Download www.ri.cmu.edu

Model learning combined with dynamic programming has been shown to be e ective for learning control of continuous state dynamic systems. The simplest method assumes the learned mod...

Jeff G. Schneider

claim paper

Read More »

162

click to vote

COLT
2004
Springer

109views Machine Learning» more COLT 2004»

Concentration Bounds for Unigrams Language Model

16 years 12 days ago

Download www.math.tau.ac.il

Abstract. We show several PAC-style concentration bounds for learning unigrams language model. One interesting quantity is the probability of all words appearing exactly k times in...

Evgeny Drukh, Yishay Mansour

claim paper

Read More »

166

click to vote

AAAI
2008

207views Intelligent Agents» more AAAI 2008»

Adaptive Importance Sampling with Automatic Model Selection in Value Function Approximation

15 years 9 months ago

Download sugiyama-www.cs.titech.ac.jp

Off-policy reinforcement learning is aimed at efficiently reusing data samples gathered in the past, which is an essential problem for physically grounded AI as experiments are us...

Hirotaka Hachiya, Takayuki Akiyama, Masashi Sugiya...

claim paper

Read More »

« Prev « First page 7 / 18 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers