Sciweavers

1237 search results - page 203 / 248
» Simulation sampling with live-points
NIPS
2001
13 years 9 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
NIPS
2001
13 years 9 months ago
Model-Free Least-Squares Policy Iteration
We propose a new approach to reinforcement learning which combines least squares function approximation with policy iteration. Our method is model-free and completely off policy. ...
Michail G. Lagoudakis, Ronald Parr
GRAPHICSINTERFACE
2000
13 years 9 months ago
Adaptive Representation of Specular Light Flux
Caustics produce beautiful and intriguing illumination patterns. However, their complex behavior makes them difficult to simulate accurately in all but the simplest configurations....
Normand Brière, Pierre Poulin
AAAI
1996
13 years 9 months ago
A Clinician's Tool for Analyzing Non-Compliance
We describe a computer program to assist a clinician with assessing the efficacy of treatments in experimental studies for which treatment assignment is random but subject complianc...
David Maxwell Chickering, Judea Pearl
WCE
2007
13 years 9 months ago
Bootstrap Confidence Interval for the Median Failure Time of Three-Parameter Weibull Distribution
In many applications of failure time data analysis, it is important to perform inferences about the median of the distribution function in situations of failure time data model...
N. A. Ibrahim, A. Kudus