Sciweavers

1166 search results - page 48 / 234
» Negotiating Using Rewards
NIPS
2007
13 years 10 months ago
Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Ambuj Tewari, Peter L. Bartlett
AAAI
2006
13 years 10 months ago
Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance
As robots become a mass consumer product, they will need to learn new skills by interacting with typical human users. Past approaches have adapted reinforcement learning (RL) to a...
Andrea Lockerd Thomaz, Cynthia Breazeal
COLT
2010
Springer
13 years 6 months ago
Nonparametric Bandits with Covariates
We consider a bandit problem which involves sequential sampling from two populations (arms). Each arm produces a noisy reward realization which depends on an observable random cov...
Philippe Rigollet, Assaf Zeevi
ICSOC
2007
Springer
14 years 3 months ago
Negotiation of Service Level Agreements: An Architecture and a Search-Based Approach
Software systems built by composing existing services are more and more capturing the interest of researchers and practitioners. The envisaged long term scenario is that services, ...
Elisabetta Di Nitto, Massimiliano Di Penta, Alessi...
PRICAI
1999
Springer
14 years 1 month ago
Making Rational Decisions in N-by-N Negotiation Games with a Trusted Third Party
The optimal decision for an agent to make at a given game situation often depends on the decisions that other agents make at the same time. Rational agents will try to find a stabl...
Shih-Hung Wu, Von-Wun Soo