GLOBECOM
2006
IEEE

Adaptive Learning of Transmission Control Policies for MIMO Fading Channels under Delay Constraint

This paper addresses learning-based adaptive resource allocation for wireless MIMO channels with Markovian fading. The problem is posed as a constrained Markov decision process with the goal of minimizing the average transmission cost (such as the transmission power) subject to a constraint on the average holding cost (such as the transmitter delay). A standard Q-learning algorithm is employed to adaptively find the optimal policy for unknown channel/traffic statistics. Its convergence properties are discussed, and it is shown that it can compute the optimal policy relatively quickly even for rather large state spaces. To further improve the convergence rate of standard Q-learning, we establish several structural results on the optimal policies. We show that the optimal transmission policy is monotonic in the buffer occupancy. This permits us to utilize the supermodularity of the Q-factors and form a structured Q-learning algorithm that increases the convergence rate with respect to the...
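
The abstract gives no algorithmic detail, so the following is only a rough sketch of how tabular Q-learning could be applied to a buffer/channel control problem of this kind. It is not the authors' algorithm. Everything specific here is an assumption made for illustration: the two-state Markov channel and its gains, Bernoulli packet arrivals, the power cost `2**a - 1`, a fixed Lagrange multiplier `lam` that folds the holding cost into a single cost, and a discounted criterion standing in for the average-cost criterion named in the abstract. The function name `q_learning_sketch` and all parameters are hypothetical.

```python
import random


def q_learning_sketch(max_buffer=5, lam=1.0, gamma=0.95,
                      steps=200_000, eps=0.1, seed=0):
    """Tabular Q-learning on a toy buffer/channel model (illustrative only)."""
    rng = random.Random(seed)
    gain = [0.5, 2.0]   # assumed channel gains: state 0 = bad, state 1 = good
    stay = 0.8          # assumed probability the channel keeps its state
    arrival_p = 0.4     # assumed probability one packet arrives per slot

    # State is (buffer occupancy b, channel state h); action a is the number
    # of packets sent this slot, limited to 0..b.
    Q = {(b, h, a): 0.0
         for b in range(max_buffer + 1)
         for h in range(2)
         for a in range(b + 1)}
    visits = dict.fromkeys(Q, 0)

    b, h = 0, 0
    for _ in range(steps):
        # Epsilon-greedy exploration; greedy means lowest estimated cost.
        if rng.random() < eps:
            a = rng.randrange(b + 1)
        else:
            a = min(range(b + 1), key=lambda u: Q[(b, h, u)])

        # Lagrangian one-step cost: transmission power + lam * holding cost.
        cost = (2 ** a - 1) / gain[h] + lam * b

        # Simulate one slot; arrivals beyond max_buffer are dropped.
        arrive = 1 if rng.random() < arrival_p else 0
        nb = min(b - a + arrive, max_buffer)
        nh = h if rng.random() < stay else 1 - h

        # Q-learning update with a decaying per-pair step size.
        visits[(b, h, a)] += 1
        alpha = 1.0 / visits[(b, h, a)] ** 0.6
        best_next = min(Q[(nb, nh, u)] for u in range(nb + 1))
        Q[(b, h, a)] += alpha * (cost + gamma * best_next - Q[(b, h, a)])

        b, h = nb, nh

    # Greedy policy read off the learned Q-factors.
    policy = {(sb, sh): min(range(sb + 1), key=lambda u: Q[(sb, sh, u)])
              for sb in range(max_buffer + 1)
              for sh in range(2)}
    return Q, policy
```

Calling `Q, policy = q_learning_sketch()` returns the learned Q-factors and a greedy policy mapping each `(buffer, channel)` pair to a number of packets to send. In a constrained formulation, `lam` would typically be tuned so that the resulting average holding cost meets the delay constraint; that outer loop is omitted here.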
Dejan V. Djonin, Vikram Krishnamurthy
Added: 11 Jun 2010
Updated: 11 Jun 2010
Type: Conference
Year: 2006
Where: GLOBECOM
Authors: Dejan V. Djonin, Vikram Krishnamurthy