Sciweavers

1166 search results - page 92 / 234
» Negotiating Using Rewards
IJCAI
2007
15 years 3 months ago
Forward Search Value Iteration for POMDPs
Recent scaling up of POMDP solvers towards realistic applications is largely due to point-based methods which quickly converge to an approximate solution for medium-sized problems...
Guy Shani, Ronen I. Brafman, Solomon Eyal Shimony
ECML
2004
Springer
15 years 7 months ago
Filtered Reinforcement Learning
Reinforcement learning (RL) algorithms attempt to assign the credit for rewards to the actions that contributed to the reward. Thus far, credit assignment has been done in one of t...
Douglas Aberdeen
NSDI
2010
15 years 3 months ago
Contracts: Practical Contribution Incentives for P2P Live Streaming
PPLive is a popular P2P video system used daily by millions of people worldwide. Achieving this level of scalability depends on users making contributions to the system, but curre...
Michael Piatek, Arvind Krishnamurthy, Arun Venkata...
RAS
2010
15 years 17 days ago
Probabilistic Policy Reuse for inter-task transfer learning
Policy Reuse is a reinforcement learning technique that efficiently learns a new policy by using past similar learned policies. The Policy Reuse learner improves its exploration b...
Fernando Fernández, Javier García, M...
ICML
2003
IEEE
16 years 3 months ago
The Cross Entropy Method for Fast Policy Search
We present a learning framework for Markovian decision processes that is based on optimization in the policy space. Instead of using relatively slow gradient-based optimization al...
Shie Mannor, Reuven Y. Rubinstein, Yohai Gat