Sciweavers

» Optimizing interconnection policies
UAI
2004
15 years 5 months ago
Heuristic Search Value Iteration for POMDPs
We present a novel POMDP planning algorithm called heuristic search value iteration (HSVI). HSVI is an anytime algorithm that returns a policy and a provable bound on its regret w...
Trey Smith, Reid G. Simmons
JASSS
1998
15 years 4 months ago
Qualitative modeling and simulation of socio-economic phenomena
This paper describes an application of recently developed qualitative reasoning techniques to complex, socio-economic allocation problems. We explain why we believe traditional op...
Giorgio Brajnik, Marji Lines
ISAAC
2010
Springer
15 years 2 months ago
Lower Bounds for Howard's Algorithm for Finding Minimum Mean-Cost Cycles
Howard's policy iteration algorithm is one of the most widely used algorithms for finding optimal policies for controlling Markov Decision Processes (MDPs). When applied to we...
Thomas Dueholm Hansen, Uri Zwick
ICONIP
2009
15 years 2 months ago
Tracking in Reinforcement Learning
Reinforcement learning induces non-stationarity at several levels. Adaptation to non-stationary environments is of course a desired feature of a fair RL algorithm. Yet, even if the...
Matthieu Geist, Olivier Pietquin, Gabriel Fricout
CDC
2010
IEEE
14 years 11 months ago
Dynamic product assembly and inventory control for maximum profit
We consider a manufacturing plant that purchases raw materials for product assembly and then sells the final products to customers. There are M types of raw materials and K types o...
Michael J. Neely, Longbo Huang