Sciweavers

200 search results - page 31 / 40
» Point-Based Policy Iteration
ICML
2010
IEEE
13 years 8 months ago
Convergence of Least Squares Temporal Difference Methods Under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...
Huizhen Yu
GLOBECOM
2008
IEEE
13 years 7 months ago
Nonlinear Quadratic Pricing for Concavifiable Utilities in Network Rate Control
This paper deals with a category of concavifiable functions that can be used to model inelastic traffic in the network. Such a class of functions can be concavified within an interva...
Quanyan Zhu, Raouf Boutaba
FORMATS
2007
Springer
13 years 11 months ago
Combining Formal Verification with Observed System Execution Behavior to Tune System Parameters
Resource-limited DRE (Distributed Real-time Embedded) systems can benefit greatly from dynamic adaptation of system parameters. We propose a novel approach that employs iterative t...
Minyoung Kim, Mark-Oliver Stehr, Carolyn L. Talcot...
IEEEARES
2007
IEEE
14 years 1 month ago
Formalising Dynamic Trust Negotiations in Decentralised Collaborative e-Health Systems
Access control in decentralised collaborative systems presents huge challenges, especially where many autonomous entities including organisations, humans, software agents from diff...
Oluwafemi Ajayi, Richard O. Sinnott, Anthony Stell
IJCAI
2003
13 years 9 months ago
A Planning Algorithm for Predictive State Representations
We address the problem of optimally controlling stochastic environments that are partially observable. The standard method for tackling such problems is to define and solve a Part...
Masoumeh T. Izadi, Doina Precup