Adaptive Learning of Transmission Control Policies for MIMO Fading Channels under Delay Constraint

Download www.ece.ubc.ca

This paper addresses learning-based adaptive resource allocation for wireless MIMO channels with Markovian fading. The problem is posed as a Constrained Markov Decision Process whose goal is to minimize the average transmission cost (such as the transmission power) subject to a constraint on the average holding cost (such as the transmitter delay). The standard Q-learning algorithm is employed to adaptively find the optimal policy for unknown channel/traffic statistics; its convergence properties are discussed, and it is shown that it can compute the optimal policy relatively quickly even for rather large state spaces. To further improve the convergence rate of standard Q-learning, we establish several structural results on the optimal policies. We show that the optimal transmission policy is monotonic in the buffer occupancy. This permits us to utilize the supermodularity of the Q-factors and form a structured Q-learning algorithm that increases the convergence rate with respect to the...
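The idea of learning a transmission policy from observed costs can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a toy model with a single buffer, a two-state Markov channel, a fixed Lagrange multiplier `LAM` that folds the holding cost into the transmission cost, and a discounted cost in place of the average cost named in the abstract. All constants and function names (`step`, `q_learning`, `greedy_policy`) are hypothetical.

```python
import random

# All values below are hypothetical and chosen only for illustration.
B = 4                      # buffer capacity in packets
GAIN = [0.5, 2.0]          # channel gain in the bad (0) and good (1) state
P_STAY = 0.8               # probability the Markov channel keeps its state
P_ARRIVAL = 0.4            # Bernoulli packet arrival probability
LAM = 2.0                  # fixed Lagrange multiplier on the holding cost
GAMMA = 0.95               # discount factor (simplification of average cost)


def step(b, h, u, rng):
    """Send u packets from buffer b in channel state h; return cost and next state."""
    power = u * u / GAIN[h]            # convex transmission cost
    holding = b                        # holding (delay) cost
    cost = power + LAM * holding       # Lagrangian one-step cost
    arrival = 1 if rng.random() < P_ARRIVAL else 0
    b_next = min(b - u + arrival, B)   # packets beyond capacity are dropped
    h_next = h if rng.random() < P_STAY else 1 - h
    return cost, b_next, h_next


def q_learning(steps=50000, eps=0.1, seed=0):
    """Tabular Q-learning (cost minimization) with epsilon-greedy exploration."""
    rng = random.Random(seed)
    # q[(b, h)][u] exists only for feasible actions 0 <= u <= b.
    q = {(b, h): [0.0] * (b + 1) for b in range(B + 1) for h in (0, 1)}
    visits = {s: [0] * len(a) for s, a in q.items()}
    b, h = 0, 0
    for _ in range(steps):
        acts = q[(b, h)]
        if rng.random() < eps:
            u = rng.randrange(len(acts))                     # explore
        else:
            u = min(range(len(acts)), key=acts.__getitem__)  # exploit
        cost, b2, h2 = step(b, h, u, rng)
        visits[(b, h)][u] += 1
        alpha = 1.0 / visits[(b, h)][u]                      # decaying step size
        target = cost + GAMMA * min(q[(b2, h2)])
        acts[u] += alpha * (target - acts[u])
        b, h = b2, h2
    return q


def greedy_policy(q):
    """Pick the cost-minimizing action in every state."""
    return {s: min(range(len(a)), key=a.__getitem__) for s, a in q.items()}


q = q_learning()
policy = greedy_policy(q)
for h in (0, 1):
    print("channel", h, [policy[(b, h)] for b in range(B + 1)])
```

Each printed row lists the learned number of packets to send for buffer occupancies 0 through `B`. If the learned policy is close to optimal for this toy model, the actions would be expected to be nondecreasing in the buffer occupancy, in line with the monotonicity result stated in the abstract, although a finite run need not show this exactly.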

Dejan V. Djonin, Vikram Krishnamurthy

Real-time Traffic

Convergence Rate | GLOBECOM 2006 | Optimal Policy | Standard Q-learning Algorithm | Telecommunications

» Transmission with Energy Harvesting Nodes in Fading Wireless Channels: Optimal Policies

» Transmission control in cognitive radio as a Markovian dynamic game: Structural result on r...

» Cross-Layer Rate and Power Adaptation Strategies for IR-HARQ Systems over Fading Channels wi...

Post Info

Added	11 Jun 2010
Updated	11 Jun 2010
Type	Conference
Year	2006
Where	GLOBECOM
Authors	Dejan V. Djonin, Vikram Krishnamurthy

Sciweavers
