Sciweavers

1227 search results - page 184 / 246
» Learning Rates for Q-Learning
ICML
2010
IEEE
13 years 11 months ago
Clustering processes
The problem of clustering is considered, for the case when each data point is a sample generated by a stationary ergodic process. We propose a very natural asymptotic notion of co...
Daniil Ryabko
ICML
2010
IEEE
13 years 11 months ago
Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multi-armed bandit problem, where the payoff function is ...
Niranjan Srinivas, Andreas Krause, Sham Kakade, Ma...
AAAI
2010
13 years 10 months ago
The Boosting Effect of Exploratory Behaviors
Active object exploration is one of the hallmarks of human and animal intelligence. Research in psychology has shown that the use of multiple exploratory behaviors is crucial for ...
Jivko Sinapov, Alexander Stoytchev
ICANN
2010
Springer
13 years 10 months ago
Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients
Developing superior artificial board-game players is a widely studied area of Artificial Intelligence. Among the most challenging games is the Asian game of Go, which, des...
Mandy Grüttner, Frank Sehnke, Tom Schaul, J...
CORR
2010
Springer
13 years 10 months ago
Selfish Response to Epidemic Propagation
An epidemic spreading in a network calls for a decision on the part of the network members: They should decide whether to protect themselves or not. Their decision depends on the ...
George Theodorakopoulos, Jean-Yves Le Boudec, John...