Sciweavers

» Negotiating Using Rewards
IJCAI
2007
13 years 11 months ago
Forward Search Value Iteration for POMDPs
Recent scaling up of POMDP solvers towards realistic applications is largely due to point-based methods which quickly converge to an approximate solution for medium-sized problems...
Guy Shani, Ronen I. Brafman, Solomon Eyal Shimony
ECML
2004
Springer
14 years 3 months ago
Filtered Reinforcement Learning
Reinforcement learning (RL) algorithms attempt to assign the credit for rewards to the actions that contributed to the reward. Thus far, credit assignment has been done in one of t...
Douglas Aberdeen
NSDI
2010
13 years 11 months ago
Contracts: Practical Contribution Incentives for P2P Live Streaming
PPLive is a popular P2P video system used daily by millions of people worldwide. Achieving this level of scalability depends on users making contributions to the system, but curre...
Michael Piatek, Arvind Krishnamurthy, Arun Venkata...
RAS
2010
13 years 8 months ago
Probabilistic Policy Reuse for inter-task transfer learning
Policy Reuse is a reinforcement learning technique that efficiently learns a new policy by using past similar learned policies. The Policy Reuse learner improves its exploration b...
Fernando Fernández, Javier García, M...
ICML
2003
IEEE
14 years 11 months ago
The Cross Entropy Method for Fast Policy Search
We present a learning framework for Markovian decision processes that is based on optimization in the policy space. Instead of using relatively slow gradient-based optimization al...
Shie Mannor, Reuven Y. Rubinstein, Yohai Gat