Sciweavers

ICML
1995
IEEE
15 years 14 days ago
Stable Function Approximation in Dynamic Programming
The success of reinforcement learning in practical problems depends on the ability to combine function approximation with temporal difference methods such as value iteration. Experime...
Geoffrey J. Gordon
ICML
1995
IEEE
15 years 14 days ago
Ant-Q: A Reinforcement Learning Approach to the Traveling Salesman Problem
In this paper we introduce Ant-Q, a family of algorithms which present many similarities with Q-learning (Watkins, 1989), and which we apply to the solution of symmetric and asymm...
Luca Maria Gambardella, Marco Dorigo
ICML
1995
IEEE
15 years 14 days ago
Visualizing High-Dimensional Structure with the Incremental Grid Growing Neural Network
Understanding high-dimensional real world data usually requires learning the structure of the data space. The structure may contain high-dimensional clusters that are related in co...
Justine Blackmore, Risto Miikkulainen
ICML
1995
IEEE
15 years 14 days ago
Residual Algorithms: Reinforcement Learning with Function Approximation
A number of reinforcement learning algorithms have been developed that are guaranteed to converge to the optimal solution when used with lookup tables. It is shown, however, that ...
Leemon C. Baird III
ICML
1996
IEEE
15 years 14 days ago
Applying the Multiple Cause Mixture Model to Text Categorization
Mehran Sahami, Marti A. Hearst, Eric Saund
ICML
1996
IEEE
15 years 14 days ago
Representing and Learning Quality-Improving Search Control Knowledge
Generating good, production-quality plans is an essential element in transforming planners from research tools into real-world applications, but one that has been frequently overl...
M. Alicia Pérez
ICML
1996
IEEE
15 years 14 days ago
Unsupervised Learning Using MML
This paper discusses the unsupervised learning problem. An important part of the unsupervised learning problem is determining the number of constituent groups (components or classes)...
Jonathan J. Oliver, Rohan A. Baxter, Chris S. Wall...
ICML
1996
IEEE
15 years 14 days ago
Searching for Structure in Multiple Streams of Data
Finding structure in multiple streams of data is an important problem. Consider the streams of data flowing from a robot's sensors, the monitors in an intensive care unit, or p...
Tim Oates, Paul R. Cohen
ICML
1996
IEEE
15 years 14 days ago
Sensitive Discount Optimality: Unifying Discounted and Average Reward Reinforcement Learning
Research in reinforcement learning (RL) has thus far concentrated on two optimality criteria: the discounted framework, which has been very well-studied, and the average reward frame...
Sridhar Mahadevan