Sciweavers

771 search results - page 74 / 155
» Markov Decision Processes with Arbitrary Reward Processes
JMLR
2006
15 years 4 months ago
Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...
Rémi Munos
EUROPKI
2004
Springer
15 years 9 months ago
A Probabilistic Model for Evaluating the Operational Cost of PKI-based Financial Transactions
The use of PKI in large scale environments suffers some inherent problems concerning the options to adopt for the optimal cost-centered operation of the system. In this paper a Mar...
Agapios N. Platis, Costas Lambrinoudakis, Assimaki...
ATAL
2009
Springer
15 years 10 months ago
Online exploration in least-squares policy iteration
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...
STACS
1997
Springer
15 years 8 months ago
Methods and Applications of (MAX, +) Linear Algebra
Exotic semirings such as the “(max, +) semiring” (R ∪ {−∞}, max, +), or the “tropical semiring” (N ∪ {+∞}, min, +), have been invented and reinvented many times s...
Stephane Gaubert, Max Plus
AI
2006
Springer
15 years 8 months ago
An Efficient Resource Allocation Approach in Real-Time Stochastic Environment
We are interested in contributing to solving effectively a particular type of real-time stochastic resource allocation problem. Firstly, one distinction is that certain tasks may c...
Pierrick Plamondon, Brahim Chaib-draa, Abder Rezak...