Sciweavers

ECAI
2000
Springer
14 years 4 months ago
Efficient Asymptotic Approximation in Temporal Difference Learning
Abstract. TD(
Frédérick Garcia, Florent Serre
FSTTCS
2006
Springer
14 years 4 months ago
Testing Probabilistic Equivalence Through Reinforcement Learning
We propose a new approach to verification of probabilistic processes for which the model may not be available. We use a technique from Reinforcement Learning to approximate how far...
Josee Desharnais, François Laviolette, Sami...
KDD
2010
ACM
14 years 4 months ago
Optimizing debt collections using constrained reinforcement learning
In this paper, we propose and develop a novel approach to the problem of optimally managing the tax, and more generally debt, collections processes at financial institutions. Our...
Naoki Abe, Prem Melville, Cezar Pendus, Chandan K....
ACMICEC
2007
ACM
14 years 4 months ago
Learning and adaptivity in interactive recommender systems
Recommender systems are intelligent E-commerce applications that assist users in a decision-making process by offering personalized product recommendations during an interaction s...
Tariq Mahmood, Francesco Ricci
ISCC
2000
IEEE
14 years 4 months ago
Dynamic Routing and Wavelength Assignment Using First Policy Iteration
With standard assumptions the routing and wavelength assignment problem (RWA) can be viewed as a Markov Decision Process (MDP). The problem, however, defies an exact solution bec...
Esa Hyytiä, Jorma T. Virtamo
ICVS
2001
Springer
14 years 4 months ago
Adapting Object Recognition across Domains: A Demonstration
High-level vision systems use object, scene or domain specific knowledge to interpret images. Unfortunately, this knowledge has to be acquired for every domain. This makes it diffi...
Bruce A. Draper, Ulrike Ahlrichs, Dietrich Paulus
ICML
2006
IEEE
14 years 6 months ago
Automatic basis function construction for approximate dynamic programming and reinforcement learning
We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
Philipp W. Keller, Shie Mannor, Doina Precup
FCCM
2006
IEEE
14 years 6 months ago
Scalable Hardware Architecture for Real-Time Dynamic Programming Applications
This paper introduces a novel architecture for performing the core computations required by dynamic programming (DP) techniques. The latter pertain to a vast range of a...
Brad Matthews, Itamar Elhanany
VTC
2008
IEEE
14 years 6 months ago
Adaptive Call Admission Control with Dynamic Resource Reallocation for Cell-Based Multirate Wireless Systems
This paper studies the admission control and resource allocation in a cell-based wireless system that supports singlemedia and multirate services. Utilizing the idea of adaptive...
Kai-Wei Ke, Chen-Nien Tsai, Ho-Ting Wu, Chia-Hao H...
CDC
2008
IEEE
14 years 7 months ago
Information state for Markov decision processes with network delays
We consider a networked control system, where each subsystem evolves as a Markov decision process (MDP). Each subsystem is coupled to its neighbors via communication links over wh...
Sachin Adlakha, Sanjay Lall, Andrea J. Goldsmith