Sciweavers

4485 search results - page 751 / 897
» Equivalences on Observable Processes
ALENEX
2001
15 years 5 months ago
A Probabilistic Spell for the Curse of Dimensionality
Range searches in metric spaces can be very difficult if the space is "high dimensional", i.e. when the histogram of distances has a large mean and a small variance. The so-cal...
Edgar Chávez, Gonzalo Navarro
IJCAI
2001
15 years 5 months ago
Exploiting Multiple Secondary Reinforcers in Policy Gradient Reinforcement Learning
Most formulations of Reinforcement Learning depend on a single reinforcement reward value to guide the search for the optimal policy solution. If observation of this reward is rar...
Gregory Z. Grudic, Lyle H. Ungar
IJCAI
2001
15 years 5 months ago
Complexity of Probabilistic Planning under Average Rewards
A general and expressive model of sequential decision making under uncertainty is provided by the Markov decision processes (MDPs) framework. Complex applications with very large ...
Jussi Rintanen
NIPS
2001
15 years 5 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
NIPS
2001
15 years 5 months ago
Orientation-Selective aVLSI Spiking Neurons
We describe a programmable multi-chip VLSI neuronal system that can be used for exploring spike-based information processing models. The system consists of a silicon retina, a PIC...
Shih-Chii Liu, Jörg Kramer, Giacomo Indiveri,...