Sciweavers

44 search results - page 8 / 9
» Sampling Methods for Action Selection in Influence Diagrams
IWANN
1999
Springer
Using Temporal Neighborhoods to Adapt Function Approximators in Reinforcement Learning
To avoid the curse of dimensionality, function approximators are used in reinforcement learning to learn value functions for individual states. In order to make better use of comp...
R. Matthew Kretchmar, Charles W. Anderson
ML
2006
ACM
Learning to bid in bridge
Bridge bidding is considered to be one of the most difficult problems for game-playing programs. It involves four agents rather than two, including a cooperative agent. In additio...
Asaf Amit, Shaul Markovitch
BMCBI
2007
Statistical significance of quantitative PCR
Background: PCR has the potential to detect and precisely quantify specific DNA sequences, but it is not yet often used as a fully quantitative method. A number of data collection...
Yann Karlen, Alan McNair, Sébastien Persegu...
UAI
2000
Fast Planning in Stochastic Games
Stochastic games generalize Markov decision processes (MDPs) to a multiagent setting by allowing the state transitions to depend jointly on all player actions, and having rewards de...
Michael J. Kearns, Yishay Mansour, Satinder P. Sin...
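As background for the abstract above (not the paper's planning algorithm): in a stochastic game, the next-state distribution and the reward depend on the joint action of all players, not on a single agent's action as in an MDP. A minimal sketch with purely illustrative states, actions, and numbers:

```python
import random

# Illustrative two-player stochastic game (values are assumptions, not from
# the paper): both the transition distribution and player 1's reward are
# keyed on the JOINT action pair (a1, a2).
TRANSITIONS = {            # (state, a1, a2) -> [(next_state, probability), ...]
    ("s0", 0, 0): [("s0", 0.5), ("s1", 0.5)],
    ("s0", 0, 1): [("s1", 1.0)],
    ("s0", 1, 0): [("s0", 1.0)],
    ("s0", 1, 1): [("s1", 1.0)],
}
REWARDS = {                # (state, a1, a2) -> reward to player 1
    ("s0", 0, 0): 1.0,
    ("s0", 0, 1): -1.0,
    ("s0", 1, 0): 0.0,
    ("s0", 1, 1): 2.0,
}

def step(state, a1, a2, rng=random):
    """Sample a next state from the joint-action transition distribution."""
    r, cum = rng.random(), 0.0
    for nxt, p in TRANSITIONS[(state, a1, a2)]:
        cum += p
        if r < cum:
            return nxt, REWARDS[(state, a1, a2)]
    return nxt, REWARDS[(state, a1, a2)]  # fallback for rounding error
```

With a single player (a2 fixed), the tables collapse to an ordinary MDP, which is the sense in which stochastic games generalize MDPs.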
ICML
2000
IEEE
Eligibility Traces for Off-Policy Policy Evaluation
Eligibility traces have been shown to speed reinforcement learning, to make it more robust to hidden states, and to provide a link between Monte Carlo and temporal-difference meth...
Doina Precup, Richard S. Sutton, Satinder P. Singh
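The abstract above refers to eligibility traces in general. As an illustrative aside (a textbook on-policy sketch, not the paper's off-policy algorithm), here is tabular TD(λ) policy evaluation with accumulating traces; all names and parameter values are assumptions:

```python
def td_lambda(episodes, n_states, alpha=0.1, gamma=0.9, lam=0.8):
    """Tabular TD(lambda) policy evaluation with accumulating traces.

    Each episode is a list of (state, reward, next_state) transitions.
    """
    V = [0.0] * n_states
    for episode in episodes:
        e = [0.0] * n_states              # eligibility traces, reset per episode
        for s, r, s_next in episode:
            delta = r + gamma * V[s_next] - V[s]  # TD error
            e[s] += 1.0                   # accumulate trace for visited state
            for i in range(n_states):
                V[i] += alpha * delta * e[i]  # credit all recently visited states
                e[i] *= gamma * lam           # decay traces
    return V
```

At λ = 0 this reduces to one-step TD, and at λ = 1 it approaches a Monte Carlo update, which is the Monte Carlo/temporal-difference link the abstract mentions.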