We consider the task of reinforcement learning with linear value function approximation. Temporal difference algorithms, and in particular the Least-Squares Temporal Difference (L...
In this paper, we show that the proportional response dynamics, a utility-based distributed dynamics, converges to the market equilibrium in the Fisher market with consta...
The problem of maximizing the sum of the transmit rates while limiting the outage probability below an appropriate threshold is investigated for networks where the nodes have li...
M. D'Angelo, Carlo Fischione, Matteo Butussi, Ales...
In 1958, Wagner and Whitin published a seminal paper on the deterministic uncapacitated lot-sizing problem, a fundamental model that is embedded in many practical production plann...
Structured output prediction is an important machine learning problem in both theory and practice, and the max-margin Markov network (M³N) is an effective approach. All state-of-...