Search Sciweavers | Sciweavers

166 search results - page 24 / 34

» Safe exploration for reinforcement learning

ATAL
2010
Springer

146 views, Intelligent Agents

PAC-MDP learning with knowledge-based admissible models

15 years 6 months ago

Download www.aamas-conference.org

PAC-MDP algorithms approach the exploration-exploitation problem of reinforcement learning agents in an effective way which guarantees that with high probability, the algorithm pe...

Marek Grzes, Daniel Kudenko

HYBRID
2005
Springer

102 views, Control Systems

Learning Multi-modal Control Programs

15 years 11 months ago

Download users.ece.gatech.edu

Abstract. Multi-modal control is a commonly used design tool for breaking up complex control tasks into sequences of simpler tasks. In this paper, we show that by viewing the contr...

Tejas R. Mehta, Magnus Egerstedt

IJCAI
2007

201 views, Artificial Intelligence

Using Linear Programming for Bayesian Exploration in Markov Decision Processes

15 years 7 months ago

Download www.cs.mcgill.ca

A key problem in reinforcement learning is finding a good balance between the need to explore the environment and the need to gain rewards by exploiting existing knowledge. Much ...

Pablo Samuel Castro, Doina Precup

KDD
2010
ACM

289 views, Data Mining

Exploitation and exploration in a performance based contextual advertising system

15 years 4 months ago

Download www.cs.umass.edu

The dynamic marketplace in online advertising calls for ranking systems that are optimized to consistently promote and capitalize better performing ads. The streaming nature of on...

Wei Li 0010, Xuerui Wang, Ruofei Zhang, Ying Cui, ...

EWCBR
2008
Springer

206 views, Automated Reasoning

Forgetting Reinforced Cases

15 years 8 months ago

Download agora.ulaval.ca

To meet time constraints, a CBR system must control the time spent searching in the case base for a solution. In this paper, we present the results of a case study comparing the p...

Houcine Romdhane, Luc Lamontagne
