Sciweavers

200 search results - page 38 / 40
» Point-Based Policy Iteration
MOBIHOC
2007
ACM
14 years 7 months ago
Distributed opportunistic scheduling for ad-hoc communications: an optimal stopping approach
We consider distributed opportunistic scheduling (DOS) in wireless ad-hoc networks, where many links contend for the same channel using random access. In such networks, distribute...
Dong Zheng, Weiyan Ge, Junshan Zhang
COLT
2008
Springer
13 years 9 months ago
Adapting to a Changing Environment: the Brownian Restless Bandits
In the multi-armed bandit (MAB) problem there are k distributions associated with the rewards of playing each of k strategies (slot machine arms). The reward distributions are ini...
Aleksandrs Slivkins, Eli Upfal
NIPS
1996
13 years 8 months ago
Multidimensional Triangulation and Interpolation for Reinforcement Learning
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
Scott Davies
WWW
2010
ACM
14 years 2 months ago
Privacy wizards for social networking sites
Privacy is an enormous problem in online social networking sites. While sites such as Facebook allow users fine-grained control over who can see their profiles, it is difficult ...
Lujun Fang, Kristen LeFevre
ICRA
2008
IEEE
14 years 2 months ago
An approximate algorithm for solving oracular POMDPs
We propose a new approximate algorithm, LAJIV (Lookahead J-MDP Information Value), to solve Oracular Partially Observable Markov Decision Problems (OPOMDPs), a special ...
Nicholas Armstrong-Crews, Manuela M. Veloso