Sciweavers

» Sequential cost-sensitive decision making with reinforcement...
ECML
2005
Springer
14 years 1 month ago
Using Rewards for Belief State Updates in Partially Observable Markov Decision Processes
Partially Observable Markov Decision Processes (POMDP) provide a standard framework for sequential decision making in stochastic environments. In this setting, an agent takes actio...
Masoumeh T. Izadi, Doina Precup
COLT
2007
Springer
14 years 1 month ago
Minimax Bounds for Active Learning
This paper analyzes the potential advantages and theoretical challenges of “active learning” algorithms. Active learning involves sequential sampling procedures that use infor...
Rui Castro, Robert D. Nowak
IAT
2010
IEEE
13 years 5 months ago
Selecting Operator Queries Using Expected Myopic Gain
When its human operator cannot continuously supervise (much less teleoperate) an agent, the agent should be able to recognize its limitations and ask for help when it risks making...
Robert Cohn, Michael Maxim, Edmund H. Durfee, Sati...
CORR
2010
Springer
13 years 7 months ago
Neuroevolutionary optimization
Temporal difference methods are theoretically grounded and empirically effective methods for addressing reinforcement learning problems. In most real-world reinforcement learning ...
Eva Volná
ECCV
2010
Springer
13 years 11 months ago
Discriminative Tracking by Metric Learning
We present a discriminative model that casts appearance modeling and visual matching into a single objective for visual tracking. Most previous discriminative models for visual tra...