Sciweavers

451 search results - page 3 / 91
» Performance evaluation with temporal rewards
JAIR
2006
Decision-Theoretic Planning with non-Markovian Rewards
A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic...
Sylvie Thiébaux, Charles Gretton, John K. S...
VALUETOOLS
2006
ACM
Analysis of Markov reward models using zero-suppressed multi-terminal BDDs
High-level stochastic description methods such as stochastic Petri nets, stochastic UML statecharts etc., together with specifications of performance variables (PVs), enable a co...
Kai Lampka, Markus Siegle
AAAI
2006
QUICR-Learning for Multi-Agent Coordination
Coordinating multiple agents that need to perform a sequence of actions to maximize a system level reward requires solving two distinct credit assignment problems. First, credit m...
Adrian K. Agogino, Kagan Tumer
CHI
2010
ACM
Physical activity motivating games: virtual rewards for real activity
Contemporary lifestyle has become increasingly sedentary: little physical (sports, exercises) and much sedentary (TV, computers) activity. The nature of sedentary activity is self...
Shlomo Berkovsky, Mac Coombe, Jill Freyne, Dipak B...
AAAI
2006
Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance
As robots become a mass consumer product, they will need to learn new skills by interacting with typical human users. Past approaches have adapted reinforcement learning (RL) to a...
Andrea Lockerd Thomaz, Cynthia Breazeal