Sciweavers

» Randomness, Stochasticity and Approximations
CDC
2009
IEEE
15 years 11 months ago
A simulation-based method for aggregating Markov chains
This paper addresses model reduction for a Markov chain on a large state space. A simulation-based framework is introduced to perform state aggregation of the Markov chain base...
Kun Deng, Prashant G. Mehta, Sean P. Meyn
CDC
2009
IEEE
15 years 11 months ago
Q-learning and Pontryagin's Minimum Principle
Q-learning is a technique used to compute an optimal policy for a controlled Markov chain based on observations of the system controlled using a non-optimal policy. It ...
Prashant G. Mehta, Sean P. Meyn
AIPS
2007
15 years 8 months ago
Discovering Relational Domain Features for Probabilistic Planning
In sequential decision-making problems formulated as Markov decision processes, state-value function approximation using domain features is a critical technique for scaling up the...
Jia-Hong Wu, Robert Givan
NIPS
2007
15 years 7 months ago
Anytime Induction of Cost-sensitive Trees
Machine learning techniques are increasingly being used to produce a wide range of classifiers for complex real-world applications that involve nonuniform testing costs and miscl...
Saher Esmeir, Shaul Markovitch
NIPS
2007
15 years 7 months ago
Incremental Natural Actor-Critic Algorithms
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...