Sciweavers

3049 search results - page 473 / 610
» On the Convergence of Bound Optimization Algorithms
ICML
2009
16 years 4 months ago
Predictive representations for policy gradient in POMDPs
We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...
Abdeslam Boularias, Brahim Chaib-draa
ICML
2008
16 years 4 months ago
Efficiently solving convex relaxations for MAP estimation
The problem of obtaining the maximum a posteriori (MAP) estimate of a discrete random field is of fundamental importance in many areas of Computer Science. In this work, we build ...
M. Pawan Kumar, Philip H. S. Torr
ICML
1999
16 years 4 months ago
Implicit Imitation in Multiagent Reinforcement Learning
Imitation is actively being studied as an effective means of learning in multi-agent environments. It allows an agent to learn how to act well (perhaps optimally) by passively obs...
Bob Price, Craig Boutilier
ICC
2009
IEEE
15 years 11 months ago
Distributed Quality-Lifetime Maximization in Wireless Video Sensor Networks
Owing to the availability of low-cost and low-power CMOS cameras, Wireless Video Sensor Networks (WVSN) has recently become a reality. However video encoding is still a costly p...
Eren Gürses, Yuan Lin, Raouf Boutaba
NOMS
2008
IEEE
15 years 10 months ago
Host-aware routing in multicast overlay backbone
To support large-scale Internet-based broadcast of live streaming video efficiently in content delivery networks (CDNs), it is essential to implement a cost-effective overlay ...
Jun Guo, Sanjay Jha