Sciweavers

451 search results - page 3 / 91
» Temporal Rewards for Performance Evaluation
JAIR
2006
15 years 3 months ago
Decision-Theoretic Planning with non-Markovian Rewards
A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic...
Sylvie Thiébaux, Charles Gretton, John K. S...
VALUETOOLS
2006
ACM
15 years 9 months ago
Analysis of Markov reward models using zero-suppressed multi-terminal BDDs
High-level stochastic description methods such as stochastic Petri nets, stochastic UML statecharts etc., together with specifications of performance variables (PVs), enable a co...
Kai Lampka, Markus Siegle
AAAI
2006
15 years 5 months ago
QUICR-Learning for Multi-Agent Coordination
Coordinating multiple agents that need to perform a sequence of actions to maximize a system level reward requires solving two distinct credit assignment problems. First, credit m...
Adrian K. Agogino, Kagan Tumer
CHI
2010
ACM
15 years 10 months ago
Physical activity motivating games: virtual rewards for real activity
Contemporary lifestyle has become increasingly sedentary: little physical (sports, exercises) and much sedentary (TV, computers) activity. The nature of sedentary activity is self...
Shlomo Berkovsky, Mac Coombe, Jill Freyne, Dipak B...
AAAI
2006
15 years 5 months ago
Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance
As robots become a mass consumer product, they will need to learn new skills by interacting with typical human users. Past approaches have adapted reinforcement learning (RL) to a...
Andrea Lockerd Thomaz, Cynthia Breazeal