Sciweavers

599 search results - page 77 / 120
» Online learning by ellipsoid method
IWANN
1999
Springer
14 years 1 month ago
Using Temporal Neighborhoods to Adapt Function Approximators in Reinforcement Learning
To avoid the curse of dimensionality, function approximators are used in reinforcement learning to learn value functions for individual states. In order to make better use of comp...
R. Matthew Kretchmar, Charles W. Anderson
HT
2000
ACM
14 years 1 month ago
Reusable hypertext structures for distance and JIT learning
Software components for distance and just-in-time (JIT) learning are an increasingly common method of encouraging reuse and facilitating the development process [58], but no analog...
Anne Morgan Spalter, Rosemary Michelle Simpson
ICML
2007
IEEE
14 years 9 months ago
Exponentiated gradient algorithms for log-linear structured prediction
Conditional log-linear models are a commonly used method for structured prediction. Efficient learning of parameters in these models is therefore an important problem. This paper ...
Amir Globerson, Terry Koo, Xavier Carreras, Michae...
JMLR
2010
13 years 3 months ago
Adaptive Step-size Policy Gradients with Average Reward Metric
In this paper, we propose a novel adaptive step-size approach for policy gradient reinforcement learning. A new metric is defined for policy gradients that measures the effect of ...
Takamitsu Matsubara, Tetsuro Morimura, Jun Morimot...
HICSS
2005
IEEE
14 years 2 months ago
Using Content and Process Scaffolds to Support Collaborative Discourse in Asynchronous Learning Networks
Discourse, a form of collaborative learning [44], is one of the most widely used methods of teaching and learning in the online environment. Particularly in large courses, discour...
I. Wong-Bushby, Starr Roxanne Hiltz, Michael Biebe...