Search Sciweavers | Sciweavers

209 search results - page 34 / 42

» Optimization and Convergence of Observation Channels in Stoc...

ICASSP
2011
IEEE

102 views | Signal Processing

Social norm and long-run learning in peer-to-peer networks

14 years 9 months ago

Download mirlab.org

We start by formulating the resource sharing in peer-to-peer (P2P) networks as a random-matching gift-giving game, where self-interested peers aim at maximizing their own long-ter...

Yu Zhang, Mihaela van der Schaar

NIPS
2001

144 views | Information Technology

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

15 years 7 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...

CAD
2004
Springer

99 views | Theoretical Computer Science

Control point adjustment for B-spline curve approximation

15 years 5 months ago

Download i.cs.hku.hk

Pottmann et al. propose an iterative optimization scheme for approximating a target curve with a B-spline curve based on square distance minimization, or SDM. The main advantage o...

Huaiping Yang, Wenping Wang, Jia-Guang Sun

ML
1998
ACM

101 views | Machine Learning

Elevator Group Control Using Multiple Reinforcement Learning Agents

15 years 5 months ago

Download www.clear.rice.edu

Recent algorithmic and theoretical advances in reinforcement learning (RL) have attracted widespread interest. RL algorithms have appeared that approximate dynamic programming on an ...

Robert H. Crites, Andrew G. Barto

ICC
2009
IEEE

127 views | Communications

Security Games with Incomplete Information

16 years 16 days ago

Download www.tansu.alpcan.org

We study two-player security games which can be viewed as sequences of nonzero-sum matrix games where at each stage of the iterations the players make imperfect observations of ...

Kien C. Nguyen, Tansu Alpcan, Tamer Basar

