Sciweavers

10 search results - page 2 / 2
» Learning Without State-Estimation in Partially Observable Ma...
Sort
View
CDC
2008
IEEE
197views Control Systems» more  CDC 2008»
14 years 2 months ago
Dynamic spectrum access policies for cognitive radio
—We study the problem of dynamic spectrum sensing and access in cognitive radio systems as a partially observed Markov decision process (POMDP). A group of cognitive users cooper...
Jayakrishnan Unnikrishnan, Venugopal V. Veeravalli
MOBICOM
2009
ACM
14 years 2 months ago
Interference management via rate splitting and HARQ over time-varying fading channels
The coexistence of two unlicensed links is considered, where one link interferes with the transmission of the other, over a timevarying, block-fading channel. In the absence of fa...
Marco Levorato, Osvaldo Simeone, Urbashi Mitra
ICASSP
2008
IEEE
14 years 2 months ago
Bayesian update of dialogue state for robust dialogue systems
This paper presents a new framework for accumulating beliefs in spoken dialogue systems. The technique is based on updating a Bayesian Network that represents the underlying state...
Blaise Thomson, Jost Schatzmann, Steve Young
GECCO
2009
Springer
162views Optimization» more  GECCO 2009»
13 years 6 months ago
Uncertainty handling CMA-ES for reinforcement learning
The covariance matrix adaptation evolution strategy (CMAES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Verena Heidrich-Meisner, Christian Igel
ECML
2007
Springer
14 years 2 months ago
Policy Gradient Critics
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
Daan Wierstra, Jürgen Schmidhuber