Search Sciweavers | Sciweavers

337 search results - page 60 / 68

» Mean-Variance Optimization in Markov Decision Processes

NIPS
2007

146 views, Information Technology

Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs

15 years 6 months ago

Download books.nips.cc

We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...

Ambuj Tewari, Peter L. Bartlett

NIPS
2001

144 views, Information Technology

Variance Reduction Techniques for Gradient Estimates in Reinforcement Learning

15 years 5 months ago

Download jmlr.csail.mit.edu

Policy gradient methods for reinforcement learning avoid some of the undesirable properties of the value function approaches, such as policy degradation (Baxter and Bartlett, 2001...

Evan Greensmith, Peter L. Bartlett, Jonathan Baxte...

IJCAI
2003

111 views, Artificial Intelligence

Generalizing Plans to New Environments in Relational MDPs

15 years 5 months ago

Download select.cs.cmu.edu

A longstanding goal in planning research is the ability to generalize plans developed for some set of environments to a new but similar environment, with minimal or no replanning....

Carlos Guestrin, Daphne Koller, Chris Gearhart, Ne...

ICML
2007
IEEE

172 views, Machine Learning

Conditional random fields for multi-agent reinforcement learning

16 years 5 months ago

Download www.machinelearning.org

Conditional random fields (CRFs) are graphical models for modeling the probability of labels given the observations. They have traditionally been trained using a set of obser...

Xinhua Zhang, Douglas Aberdeen, S. V. N. Vishwanat...

GLOBECOM
2007
IEEE

116 views, Communications

Cross-Layer Call Admission Control for a CDMA Uplink Employing a Base-Station Antenna Array

15 years 10 months ago

Download post.queensu.ca

A novel cross-layer call admission control policy is proposed for a general CDMA beamforming system. In contrast to previously proposed call admission control (CAC) policies wh...

Wei Sheng, Steven D. Blostein
