Sciweavers

451 search results - page 37 / 91
» Temporal Rewards for Performance Evaluation
ATAL
2010
Springer
15 years 5 months ago
Closing the learning-planning loop with predictive state representations
A central problem in artificial intelligence is to choose actions to maximize reward in a partially observable, uncertain environment. To do so, we must learn an accurate model of ...
Byron Boots, Sajid M. Siddiqi, Geoffrey J. Gordon
CVPR
2009
IEEE
15 years 11 months ago
Contextualizing histogram
In this paper, we investigate how to incorporate spatial and/or temporal contextual information into classical histogram features with the aim of boosting visual classification p...
Bingbing Ni, Shuicheng Yan, Ashraf A. Kassim
ATAL
2010
Springer
15 years 4 months ago
PAC-MDP learning with knowledge-based admissible models
PAC-MDP algorithms approach the exploration-exploitation problem of reinforcement learning agents in an effective way which guarantees that with high probability, the algorithm pe...
Marek Grzes, Daniel Kudenko
AIPS
2009
15 years 5 months ago
Using Distance Estimates in Heuristic Search
This paper explores the use of an oft-ignored information source in heuristic search: a search-distance-to-go estimate. Operators frequently have different costs and cost-to-go is...
Jordan Tyler Thayer, Wheeler Ruml
NCA
2008
IEEE
15 years 4 months ago
Neurodynamic programming: a case study of the traveling salesman problem
The paper focuses on the study of solving the large-scale traveling salesman problem (TSP) based on neurodynamic programming. From this perspective, two methods, temporal differenc...
Jia Ma, Tao Yang, Zeng-Guang Hou, Min Tan, Derong ...