Sciweavers

1167 search results - page 44 / 234
» policy 2007
ICANN
2007
Springer
14 years 5 months ago
Solving Deep Memory POMDPs with Recurrent Policy Gradients
Abstract. This paper presents Recurrent Policy Gradients, a model-free reinforcement learning (RL) method creating limited-memory stochastic policies for partially observable Markov...
Daan Wierstra, Alexander Förster, Jan Peters,...
WSC
2007
14 years 1 months ago
Simulation of scheduled ordering policies in distribution supply chains
In this paper we study a decentralized distribution supply chain with one supplier and many newsvendor-type retailers that face exogenous end-customer demands. Using total supply ...
Lucy G. Chen, Srinagesh Gavirneni
IJCAI
2007
14 years 6 days ago
Using Learned Policies in Heuristic-Search Planning
Many current state-of-the-art planners rely on forward heuristic search. The success of such search typically depends on heuristic distance-to-the-goal estimates derived from the ...
Sung Wook Yoon, Alan Fern, Robert Givan
POLICY
2007
Springer
14 years 4 months ago
A Socio-cognitive Approach to Modeling Policies in Open Environments
The richness of today’s electronic communications mirrors the physical world: activities such as shopping, business and scientific collaboration are conducted online. Current intera...
Tatyana Ryutov
VTC
2007
IEEE
14 years 5 months ago
Multi-Channel Radio Resource Distribution Policies in Heterogeneous Traffic Scenarios
Multi-channel operation in wireless systems has been proposed to increase user throughput and reduce transmission delays. However, multi-channel operation requires adequate reso...
M. Carmen Lucas-Estan, Javier Gozálvez, Joa...