Search Sciweavers | Sciweavers

We define the problem of inferring a “mixture of Markov chains” based on observing a stream of interleaved outputs from these chains. We show a sharp characterization of the i...

Tugkan Batu, Sudipto Guha, Sampath Kannan


COLT
2004
Springer

Machine Learning

Polynomial Time Prediction Strategy with Almost Optimal Mistake Probability

16 years 2 days ago

Download www.cs.technion.ac.il

We give the first polynomial time prediction strategy for any PAC-learnable class C that probabilistically predicts the target with mistake probability poly(log(t))/t = Õ(1/t) w...

Nader H. Bshouty


COLT
2004
Springer

Machine Learning

Reinforcement Learning for Average Reward Zero-Sum Games

16 years 2 days ago

Download www.ece.mcgill.ca

Abstract. We consider Reinforcement Learning for average reward zero-sum stochastic games. We present and analyze two algorithms. The first is based on relative Q-learning and the ...

Shie Mannor
