Search Sciweavers | Sciweavers

2354 search results - page 285 / 471

» Randomness, Stochasticity and Approximations

CDC 2009 (IEEE), Control Systems

A simulation-based method for aggregating Markov chains

15 years 11 months ago

Download mechse.illinois.edu

This paper addresses model reduction for a Markov chain on a large state space. A simulation-based framework is introduced to perform state aggregation of the Markov chain base...

Kun Deng, Prashant G. Mehta, Sean P. Meyn

CDC 2009 (IEEE), Control Systems

Q-learning and Pontryagin's Minimum Principle

15 years 11 months ago

Download www.stanford.edu

Q-learning is a technique used to compute an optimal policy for a controlled Markov chain based on observations of the system controlled using a non-optimal policy. It ...

Prashant G. Mehta, Sean P. Meyn

AIPS 2007, Artificial Intelligence

Discovering Relational Domain Features for Probabilistic Planning

15 years 8 months ago

Download cobweb.ecn.purdue.edu

In sequential decision-making problems formulated as Markov decision processes, state-value function approximation using domain features is a critical technique for scaling up the...

Jia-Hong Wu, Robert Givan

NIPS 2007, Information Technology

Anytime Induction of Cost-sensitive Trees

15 years 7 months ago

Download books.nips.cc

Machine learning techniques are increasingly being used to produce a wide range of classifiers for complex real-world applications that involve nonuniform testing costs and miscl...

Saher Esmeir, Shaul Markovitch

NIPS 2007, Information Technology

Incremental Natural Actor-Critic Algorithms

15 years 7 months ago

Download books.nips.cc

We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...

Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
