Sciweavers

98 search results - page 14 / 20
Using Rewards for Belief State Updates in Partially Observab...
NECO
2007
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is ...
Dorit Baras, Ron Meir
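The BCM rule mentioned in the title can be summarized as a weight update Δw = η·x·y·(y − θ), where the threshold θ slides with a running average of y². A minimal sketch, with all parameter values and names chosen for illustration rather than taken from the paper:

```python
# Minimal BCM plasticity sketch (illustrative values, not from the paper):
# Delta w = eta * x * y * (y - theta); in the full rule theta slides toward
# a running average of y^2, held fixed here for simplicity.

def bcm_update(w, x, eta=0.1, theta=1.0):
    """One BCM weight update for presynaptic rate x and weight w."""
    y = w * x                       # postsynaptic activity (linear neuron)
    return w + eta * x * y * (y - theta)

# Potentiation when postsynaptic activity exceeds the threshold,
# depression when it falls below it:
w_hi = bcm_update(1.0, 2.0)         # y = 2.0 > theta, weight grows
w_lo = bcm_update(1.0, 0.5)         # y = 0.5 < theta, weight shrinks
```

The sliding threshold is what stabilizes the rule: sustained high activity raises θ, preventing unbounded potentiation.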
AAAI
2006
Point-based Dynamic Programming for DEC-POMDPs
We introduce point-based dynamic programming (DP) for decentralized partially observable Markov decision processes (DEC-POMDPs), a new discrete DP algorithm for planning strategie...
Daniel Szer, François Charpillet
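For context, the single-agent point-based backup (as in PBVI) is the building block that point-based DP for DEC-POMDPs generalizes. A hedged sketch with made-up tables; the function and table names are illustrative assumptions, not the paper's notation:

```python
# Single-agent point-based backup sketch; all tables below are toy data.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def point_backup(b, Gamma, R, T, O, gamma=0.95):
    """Back up belief point b against alpha-vector set Gamma.

    R[s][a] = reward, T[s][a][sp] = transition, O[sp][o] = observation.
    Returns the best new alpha-vector at b."""
    nS, nA, nO = len(b), len(R[0]), len(O[0])
    best = None
    for a in range(nA):
        alpha = [R[s][a] for s in range(nS)]
        for o in range(nO):
            # pick the old vector that scores best at b for this (a, o)
            cand = max(Gamma, key=lambda g: sum(
                b[s] * sum(T[s][a][sp] * O[sp][o] * g[sp] for sp in range(nS))
                for s in range(nS)))
            for s in range(nS):
                alpha[s] += gamma * sum(
                    T[s][a][sp] * O[sp][o] * cand[sp] for sp in range(nS))
        if best is None or dot(alpha, b) > dot(best, b):
            best = alpha
    return best

# Toy check: with only a zero vector in Gamma, the backup reduces to
# picking the action with the best immediate reward at b.
T = [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]]
O = [[1.0, 0.0], [0.0, 1.0]]
R = [[1.0, 0.0], [1.0, 0.0]]      # action 0 pays 1 in every state
new_alpha = point_backup([0.5, 0.5], [[0.0, 0.0]], R, T, O)
```

The decentralized setting replaces single actions with joint policy trees, but the backup-at-a-point structure is the same idea.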
FLAIRS
2001
Probabilistic Planning for Behavior-Based Robots
Partially Observable Markov Decision Process models (POMDPs) have been applied to low-level robot control. We show how to use POMDPs differently, namely for sensor planning in the ...
Amin Atrash, Sven Koenig
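At the core of any POMDP controller is the Bayes-filter belief update b′(s′) ∝ O(o|s′) Σₛ T(s′|s,a) b(s). A minimal sketch; the two-state transition and observation tables are invented for illustration, not taken from the paper:

```python
# Standard POMDP belief update (Bayes filter); toy tables for illustration.

def belief_update(b, a, o, T, O):
    """b'(sp) proportional to O[sp][o] * sum_s T[s][a][sp] * b[s]."""
    n = len(b)
    bp = [O[sp][o] * sum(T[s][a][sp] * b[s] for s in range(n))
          for sp in range(n)]
    z = sum(bp)                     # probability of observing o
    return [p / z for p in bp]      # normalize to a distribution

# Two states, one action (a=0), two observations.
T = [[[0.9, 0.1]], [[0.2, 0.8]]]    # T[s][a][sp]
O = [[0.8, 0.2], [0.3, 0.7]]        # O[sp][o]
b1 = belief_update([0.5, 0.5], a=0, o=0, T=T, O=O)
```

Starting from a uniform belief, observing o=0 (more likely in state 0) shifts the belief toward state 0.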
ISIPTA
2005
IEEE
Decision making under incomplete data using the imprecise Dirichlet model
The paper presents an efficient solution to decision problems where direct partial information on the distribution of the states of nature is available, either by observations of ...
Lev V. Utkin, Thomas Augustin
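Under Walley's imprecise Dirichlet model, observed counts n_i out of N yield the probability interval [n_i/(N+s), (n_i+s)/(N+s)] for category i, where s is the prior-strength hyperparameter (commonly s = 1 or 2). A small sketch of just this interval computation, separate from the paper's decision procedure:

```python
# Probability interval for one category under Walley's imprecise
# Dirichlet model; the example counts are made up for illustration.

def idm_interval(n_i, N, s=2.0):
    """Lower/upper probability for a category observed n_i times out of N."""
    return n_i / (N + s), (n_i + s) / (N + s)

# 6 successes in 10 trials with s = 2:
p_lo, p_hi = idm_interval(6, 10)    # interval around the MLE 0.6
```

The interval width s/(N+s) shrinks as data accumulates, which is what makes the model usable for decisions under scarce or incomplete data.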
VTC
2008
IEEE
Opportunistic Spectrum Access for Energy-Constrained Cognitive Radios
This paper considers a scenario in which a secondary user makes opportunistic use of a channel allocated to some primary network. The primary network operates in a time-slotted ma...
Anh Tuan Hoang, Ying-Chang Liang, David Tung Chong...
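The slotted setting described above can be caricatured as a sense-then-transmit loop under an energy budget. A toy simulation; the energy costs, idle probability, and function name are all assumptions for illustration, not the paper's model:

```python
import random

# Toy slotted opportunistic-access loop (illustrative parameters only):
# each slot the secondary user spends e_sense to sense the channel and,
# if the primary is idle and energy allows, spends e_tx to transmit.

def run_slots(slots, energy, e_sense=1.0, e_tx=3.0, p_idle=0.6, seed=0):
    """Return the number of successful secondary transmissions."""
    rng = random.Random(seed)
    sent = 0
    for _ in range(slots):
        if energy < e_sense:        # battery exhausted
            break
        energy -= e_sense
        if rng.random() < p_idle and energy >= e_tx:
            energy -= e_tx
            sent += 1
    return sent

sent = run_slots(200, 10.0)         # at most 2 transmissions on 10 units
```

Even this toy version shows the trade-off the paper studies: sensing consumes energy whether or not a transmission opportunity follows.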