Sciweavers

1138 search results - page 162 / 228
Feature Markov Decision Processes
ATAL
2009
Springer
15 years 10 months ago
SarsaLandmark: an algorithm for learning in POMDPs with landmarks
Reinforcement learning algorithms that use eligibility traces, such as Sarsa(λ), have been empirically shown to be effective in learning good estimated-state-based policies in pa...
Michael R. James, Satinder P. Singh
CDC
2008
IEEE
15 years 10 months ago
Event-based optimization for dispatching policies in material handling systems of general assembly lines
A material handling (MH) system of a general assembly line dispatching parts from inventory to working buffers could be complicated and costly to operate. Generally it is extrem...
Yanjia Zhao, Qianchuan Zhao, Qing-Shan Jia, Xiaoho...
CDC
2008
IEEE
15 years 10 months ago
Dynamic spectrum access policies for cognitive radio
We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooper...
Jayakrishnan Unnikrishnan, Venugopal V. Veeravalli
CDC
2008
IEEE
15 years 10 months ago
Dynamic ping optimization for surveillance in multistatic sonar buoy networks with energy constraints
In this paper we study the problem of dynamic optimization of ping schedule in an active sonar buoy network deployed to provide persistent surveillance of a littoral area throu...
Anshu Saksena, I-Jeng Wang
ICC
2008
IEEE
15 years 10 months ago
An MDP-Based Approach for Multipath Data Transmission over Wireless Networks
Maintaining performance and reliability in wireless networks is a challenging task due to the nature of wireless channels. Multipath data transmission has been used in wired sce...
Vinh Bui, Weiping Zhu, Alessio Botta, Antonio Pesc...