Search Sciweavers | Sciweavers

60 search results - page 9 / 12

» Iteratively Extending Time Horizon Reinforcement Learning

EWRL
2008

129 views, Machine Learning

Markov Decision Processes with Arbitrary Reward Processes

15 years 8 months ago

Download www.cim.mcgill.ca

Abstract. We consider a control problem where the decision maker interacts with a standard Markov decision process with the exception that the reward functions vary arbitrarily ove...

Jia Yuan Yu, Shie Mannor, Nahum Shimkin

ATAL
2006
Springer

147 views, Intelligent Agents

Learning to cooperate in multi-agent social dilemmas

15 years 10 months ago

Download sequel.futurs.inria.fr

In many Multi-Agent Systems (MAS), agents (even if self-interested) need to cooperate in order to maximize their own utilities. Most of the multi-agent learning algorithms focus on...

Jose Enrique Munoz de Cote, Alessandro Lazaric, Ma...

ESANN
2007

122 views, Neural Networks

The Recurrent Control Neural Network

15 years 8 months ago

Download www.dice.ucl.ac.be

This paper presents our Recurrent Control Neural Network (RCNN), which is a model-based approach for a data-efficient modelling and control of reinforcement learning problems in di...

Anton Maximilian Schäfer, Steffen Udluft, Han...

COLT
2008
Springer

179 views, Machine Learning

Adapting to a Changing Environment: the Brownian Restless Bandits

15 years 8 months ago

Download research.microsoft.com

In the multi-armed bandit (MAB) problem there are k distributions associated with the rewards of playing each of k strategies (slot machine arms). The reward distributions are ini...

Aleksandrs Slivkins, Eli Upfal

IOR
2010

99 views

Dynamic Pricing with a Prior on Market Response

15 years 5 months ago

Download web.mit.edu

We study a problem of dynamic pricing faced by a vendor with limited inventory, uncertain about demand, aiming to maximize expected discounted revenue over an infinite time horiz...

Vivek F. Farias, Benjamin Van Roy
