state transition probabilities

AROBOTS
1999

Reinforcement Learning Soccer Teams with Incomplete World Models

15 years 6 months ago

We use reinforcement learning (RL) to compute strategies for multiagent soccer teams. RL may profit significantly from world models (WMs) estimating state transition probabilities an...

Marco Wiering, Rafal Salustowicz, Jürgen Schm...

CORR
2010
Springer

Online Algorithms for the Multi-Armed Bandit Problem with Markovian Rewards

15 years 6 months ago

Download wireless.cs.uh.edu

We consider the classical multi-armed bandit problem with Markovian rewards. When played, an arm changes its state in a Markovian fashion, while it remains frozen when not played. Th...

Cem Tekin, Mingyan Liu

NIPS
2003

Robustness in Markov Decision Problems with Uncertain Transition Matrices

15 years 8 months ago

Download books.nips.cc

Optimal solutions to Markov Decision Problems (MDPs) are very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of those pro...

Arnab Nilim, Laurent El Ghaoui

IJCAI
2007

Dynamically Weighted Hidden Markov Model for Spam Deobfuscation

15 years 8 months ago

Download www.ijcai.org

Spam deobfuscation is the process of detecting obfuscated words that appear in spam emails and converting them back to the original words for correct recognition. Lexicon tree hidden M...

Seunghak Lee, Iryoung Jeong, Seungjin Choi

ICDAR
2003
IEEE

An HMM On-line Signature Verifier Incorporating Signature Trajectories

15 years 12 months ago

Download www.cse.salford.ac.uk

Authentication of individuals is rapidly becoming an important issue. On-line signature verification is one of the methods that use biometric features. This paper proposes a new H...

Daigo Muramatsu, Takashi Matsumoto

ROBOCUP
2004
Springer

Modular Learning System and Scheduling for Behavior Acquisition in Multi-agent Environment

15 years 12 months ago

Download www.er.ams.eng.osaka-u.ac.jp

The existing reinforcement learning approaches have been suffering from the policy alternation of others in multiagent dynamic environments such as RoboCup competitions since othe...

Yasutake Takahashi, Kazuhiro Edazawa, Minoru Asada
