Sciweavers

1016 search results - page 110 / 204
» Explore or Exploit
RAS
2010
13 years 7 months ago
Probabilistic Policy Reuse for inter-task transfer learning
Policy Reuse is a reinforcement learning technique that efficiently learns a new policy by using past similar learned policies. The Policy Reuse learner improves its exploration b...
Fernando Fernández, Javier García, M...
CORR
2004
Springer
13 years 9 months ago
Online convex optimization in the bandit setting: gradient descent without a gradient
We study a general online convex optimization problem. We have a convex set S and an unknown sequence of cost functions c1, c2, ..., and in each period, we choose a feasible po...
Abraham Flaxman, Adam Tauman Kalai, H. Brendan McM...
QEST
2005
IEEE
14 years 2 months ago
Toward Picture-perfect Streaming on the Internet
Quality of service (QoS) in streaming of continuous media over the Internet is poor, which is partly due to variations in delays, bandwidth limitations, and packet losses. Althoug...
Alix L. H. Chow, Leana Golubchik, John C. S. Lui
IJCAI
2007
13 years 10 months ago
Adaptive Genetic Algorithm with Mutation and Crossover Matrices
A matrix formulation for an adaptive genetic algorithm is developed using a mutation matrix and a crossover matrix. Selection, mutation, and crossover are all parameter-free in the se...
Nga Lam Law, Kwok Yip Szeto
MEDINFO
2007
13 years 10 months ago
Text Categorization Models for Identifying Unproven Cancer Treatments on the Web
The nature of the internet as a non-peer-reviewed (and more generally largely unregulated) publication medium has allowed widespread promotion of inaccurate and unproven medical ...
Yin Aphinyanaphongs, Constantin F. Aliferis