Search Sciweavers | Sciweavers

1166 search results - page 21 / 234

» Negotiating Using Rewards

PKDD 2009, Springer

Active Learning for Reward Estimation in Inverse Reinforcement Learning

Abstract. Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, w...

Manuel Lopes, Francisco S. Melo, Luis Montesano

ICML 2006, IEEE

An intrinsic reward mechanism for efficient exploration

How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...

Özgür Simsek, Andrew G. Barto

IJCAI 2001

Complexity of Probabilistic Planning under Average Rewards

A general and expressive model of sequential decision making under uncertainty is provided by the Markov decision processes (MDPs) framework. Complex applications with very large ...

Jussi Rintanen

SAC 2006, ACM

Implementing rule-based mechanisms for agent-based price negotiations

This note describes a sample implementation of automated negotiations in an e-commerce modeling multi-agent system. A specific set of rules is used for enforcing negotiation mech...

Costin Badica, Adriana Badita, Maria Ganzha

ALT 2007, Springer

Pseudometrics for State Aggregation in Average Reward Markov Decision Processes

We consider how state similarity in average reward Markov decision processes (MDPs) may be described by pseudometrics. Introducing the notion of adequate pseudometrics which are we...

Ronald Ortner
