We consider Markov Decision Processes (MDPs) as transformers on probability distributions, where with respect to a scheduler that resolves nondeterminism, the MDP can be seen as ex...
Vijay Anand Korthikanti, Mahesh Viswanathan, Gul A...
Cross-layer optimization aims at improving the performance of network users operating in a time-varying, error-prone wireless environment. However, current solutions often rely on...
Most algorithms for solving Markov decision processes rely on a discount factor, which ensures their convergence. It is generally assumed that using an artificially low discount f...
We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
When modeling real-world decision-theoretic planning problems in the Markov decision process (MDP) framework, it is often impossible to obtain a completely accurate estimate of tr...
Karina Valdivia Delgado, Scott Sanner, Leliane Nun...