Sciweavers

» Timed Control with Partial Observability
NIPS
2001
13 years 11 months ago
Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning
Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...
Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...
MOBIHOC
2009
ACM
14 years 10 months ago
Admission control and scheduling for QoS guarantees for variable-bit-rate applications on wireless channels
Providing differentiated Quality of Service (QoS) over unreliable wireless channels is an important challenge for supporting several future applications. We analyze a model that h...
I-Hong Hou, P. R. Kumar
CORR
2007
Springer
13 years 10 months ago
Universal Reinforcement Learning
We consider an agent interacting with an unmodeled environment. At each time, the agent makes an observation, takes an action, and incurs a cost. Its actions can influence futu...
Vivek F. Farias, Ciamac Cyrus Moallemi, Tsachy Wei...
FSTTCS
2005
Springer
14 years 3 months ago
The MSO Theory of Connectedly Communicating Processes
Abstract. We identify a network of sequential processes that communicate by synchronizing frequently on common actions. More precisely, we demand that there is a bound k such that ...
P. Madhusudan, P. S. Thiagarajan, Shaofa Yang
PLDI
1993
ACM
14 years 2 months ago
Dependence-Based Program Analysis
Program analysis and optimization can be speeded up through the use of the dependence flow graph (DFG), a representation of program dependences which generalizes def-use chains and...
Richard Johnson, Keshav Pingali