Search Sciweavers | Sciweavers

238 search results - page 10 / 48

» Value-Function Approximations for Partially Observable Marko...


ICML
2009
IEEE

148 views · Machine Learning

Predictive representations for policy gradient in POMDPs

16 years 7 months ago

Download damas.ift.ulaval.ca

We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...

Abdeslam Boularias, Brahim Chaib-draa


CDC
2008
IEEE

118 views · Control Systems

A density projection approach to dimension reduction for continuous-state POMDPs

16 years 1 month ago

Download netfiles.uiuc.edu

Abstract— Research on numerical solution methods for partially observable Markov decision processes (POMDPs) has primarily focused on discrete-state models, and these algorithms ...

Enlu Zhou, Michael C. Fu, Steven I. Marcus


DATE
2007
IEEE

133 views · Hardware

Stochastic modeling and optimization for robust power management in a partially observable system

16 years 1 month ago

Download www.date-conference.com

As the hardware and software complexity grows, it is unlikely for the power management hardware/software to have a full observation of the entire system status. In this paper, we ...

Qinru Qiu, Ying Tan, Qing Wu


ANOR
2010

85 views

Inventory management with partially observed nonstationary demand

15 years 7 months ago

Download www.pstat.ucsb.edu

Abstract. We consider a continuous-time model for inventory management with Markov modulated non-stationary demands. We introduce active learning by assuming that the state of the ...

Erhan Bayraktar, Michael Ludkovski


FOCS
2007
IEEE

157 views · Theoretical Computer Science

Approximation Algorithms for Partial-Information Based Stochastic Control with Markovian Rewards

16 years 1 month ago

Download www.cis.upenn.edu

We consider a variant of the classic multi-armed bandit problem (MAB), which we call FEEDBACK MAB, where the reward obtained by playing each of n independent arms varies according...

Sudipto Guha, Kamesh Munagala

