Thresholded Rewards: Acting Optimally in Timed, Zero-Sum Games

In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermediate reward to be the difference between our score and our opponent’s score; the “true” reward of a win, loss, or tie is determined at the end of a game by applying a threshold function to the cumulative intermediate reward. We introduce thresholded-rewards problems to capture this dependency of the final reward outcome on the cumulative intermediate reward. Thresholded-rewards problems reflect different real-world stochastic planning domains, especially zero-sum games, in which time and score need to be considered. We investigate the application of thresholded rewards to finite-horizon Markov Decision Processes (MDPs). In general, the optimal policy for a thresholded-rewards MDP will be nonstationary, depending on the number of time steps remaining and the cumulative intermediate reward. We introduce ...
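To make the idea concrete, the following is a minimal sketch of backward induction over (time steps remaining, cumulative intermediate reward) for a toy thresholded-rewards problem. It is not the authors' algorithm, which the truncated abstract does not describe. The names `ACTIONS`, `threshold`, and `solve`, the single underlying state, and the outcome probabilities are all hypothetical choices made for illustration. The "risky" action has negative expected score change, so an expected-reward maximizer would never pick it, yet it becomes the better choice when we are behind and time is running out.

```python
from functools import lru_cache

# Hypothetical toy problem with a single underlying state.
# Each action maps a change in score difference to its probability.
ACTIONS = {
    "safe": {0: 1.0},                      # score difference never changes
    "risky": {+1: 0.4, 0: 0.1, -1: 0.5},   # expected change is -0.1
}


def threshold(score_diff):
    """True reward at the end of the game: +1 win, 0 tie, -1 loss."""
    return (score_diff > 0) - (score_diff < 0)


@lru_cache(maxsize=None)
def solve(steps_left, score_diff):
    """Return (expected true reward, best action) by backward induction.

    The result depends on both the time remaining and the cumulative
    intermediate reward, which is why the policy is nonstationary.
    """
    if steps_left == 0:
        return float(threshold(score_diff)), None
    best_value, best_action = float("-inf"), None
    for action, outcomes in ACTIONS.items():
        value = sum(
            prob * solve(steps_left - 1, score_diff + delta)[0]
            for delta, prob in outcomes.items()
        )
        if value > best_value:  # strict comparison: earlier action wins ties
            best_value, best_action = value, action
    return best_value, best_action


if __name__ == "__main__":
    for diff in (-1, 0, 1):
        value, action = solve(1, diff)
        print(f"1 step left, score difference {diff:+d}: {action} ({value:+.2f})")
```

With one step left in this toy setting, the sketch chooses "risky" when trailing by one and "safe" when tied or ahead, which illustrates how the chosen action can depend on the score and the clock rather than on the underlying state alone.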

Colin McMillen, Manuela M. Veloso

AAAI 2007 | Cumulative Intermediate Reward | Intelligent Agents | Zero-sum Games | Final Reward Outcome

Added	02 Oct 2010
Updated	02 Oct 2010
Type	Conference
Year	2007
Where	AAAI
Authors	Colin McMillen, Manuela M. Veloso
