Search Sciweavers | Sciweavers

141

CSE
2009
IEEE

85views Theoretical Computer Science» more CSE 2009»

Reinforcement Learning of Listener Response for Mood Classification of Audio

16 years 22 days ago

This paper describes a method of applying a reinforcement learning artificial intelligence to categorize audio files by mood based on listener response during a performance. The s...

Jack Stockholm, Philippe Pasquier

claim paper

Read More »

132

click to vote

AI
2006
Springer

103views Artificial Intelligence» more AI 2006»

Trace Equivalence Characterization Through Reinforcement Learning

15 years 9 months ago

Download www2.ift.ulaval.ca

In the context of probabilistic verification, we provide a new notion of trace-equivalence divergence between pairs of Labelled Markov processes. This divergence corresponds to the...

Josee Desharnais, François Laviolette, Kris...

claim paper

Read More »

123

click to vote

AAAI
2007

68views Intelligent Agents» more AAAI 2007»

A Reinforcement Learning Algorithm with Polynomial Interaction Complexity for Only-Costly-Observable MDPs

15 years 8 months ago

Download www.aaai.org

An Unobservable MDP (UMDP) is a POMDP in which there are no observations. An Only-Costly-Observable MDP (OCOMDP) is a POMDP which extends an UMDP by allowing a particular costly a...

Roy Fox, Moshe Tennenholtz

claim paper

Read More »

161

click to vote

NECO
2010

97views more NECO 2010»

Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning

15 years 4 months ago

Download www.kyb.tuebingen.mpg.de

Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...

Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...

claim paper

Read More »

189

click to vote

AGI
2011

231views Artificial Intelligence» more AGI 2011»

Reinforcement Learning and the Bayesian Control Rule

14 years 9 months ago

Download metatip.com

We present an actor-critic scheme for reinforcement learning in complex domains. The main contribution is to show that planning and I/O dynamics can be separated such that an intra...

Pedro Alejandro Ortega, Daniel Alexander Braun, Si...

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers