185 search results - page 34 / 37

» Simulation-Based Optimization Algorithms for Finite-Horizon ...

AAAI
2010

PUMA: Planning Under Uncertainty with Macro-Actions

15 years 5 months ago

Download www.cs.berkeley.edu

Planning in large, partially observable domains is challenging, especially when a long-horizon lookahead is necessary to obtain a good policy. Traditional POMDP planners that plan...

Ruijie He, Emma Brunskill, Nicholas Roy

JMLR
2006

Causal Graph Based Decomposition of Factored MDPs

15 years 3 months ago

Download www-anw.cs.umass.edu

We present Variable Influence Structure Analysis, or VISA, an algorithm that performs hierarchical decomposition of factored Markov decision processes. VISA uses a dynamic Bayesia...

Anders Jonsson, Andrew G. Barto

ICML
2007
IEEE

Conditional random fields for multi-agent reinforcement learning

16 years 4 months ago

Download www.machinelearning.org

Conditional random fields (CRFs) are graphical models for modeling the probability of labels given the observations. They have traditionally been trained using a set of obser...

Xinhua Zhang, Douglas Aberdeen, S. V. N. Vishwanat...

QUESTA
2010

Admission control for a multi-server queue with abandonment

15 years 2 months ago

Download www-bcf.usc.edu

In an M/M/N+M queue, when there are many customers waiting, it may be preferable to reject a new arrival rather than risk that arrival later abandoning without receiving service. O...

Yasar Levent Koçaga, Amy R. Ward

ICML
2009
IEEE

Predictive representations for policy gradient in POMDPs

16 years 4 months ago

Download damas.ift.ulaval.ca

We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...

Abdeslam Boularias, Brahim Chaib-draa
