Search Sciweavers | Sciweavers

539 search results - page 31 / 108

» Learning Monotonic Linear Functions

ECML
2006
Springer

116 views | Machine Learning

Scaling Model-Based Average-Reward Reinforcement Learning for Product Delivery

15 years 9 months ago

Download web.engr.oregonstate.edu

Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high stochasticity. We present approaches that ...

Scott Proper, Prasad Tadepalli

APPROX
2009
Springer

138 views | Algorithms

Submodular Maximization over Multiple Matroids via Generalized Exchange Properties

16 years 17 days ago

Download www.math.princeton.edu

Submodular-function maximization is a central problem in combinatorial optimization, generalizing many important NP-hard problems including Max Cut in digraphs, graphs and hypergr...

Jon Lee, Maxim Sviridenko, Jan Vondrák

IPCO
2010

148 views | Optimization

Prize-Collecting Steiner Network Problems

15 years 7 months ago

Download www.openu.ac.il

In the Steiner Network problem we are given a graph with edge costs and connectivity requirements between pairs of nodes. The goal is to find a minimum-cost subgraph of this graph that contain...

MohammadTaghi Hajiaghayi, Rohit Khandekar, Guy Kor...

ATAL
2010
Springer

123 views | Intelligent Agents

Linear options

15 years 7 months ago

Download www.eecs.umich.edu

Learning, planning, and representing knowledge in large state spaces at multiple levels of temporal abstraction are key, long-standing challenges for building flexible autonomous agents. ...

Jonathan Sorg, Satinder P. Singh

ML
2002
ACM

154 views | Machine Learning

Technical Update: Least-Squares Temporal Difference Learning

15 years 5 months ago

Download www.research.rutgers.edu

TD(λ) is a popular family of algorithms for approximate policy evaluation in large MDPs. TD(λ) works by incrementally updating the value function after each observed transition. It h...

Justin A. Boyan
