Abstract-In this paper, a Q-learning-based hybrid automatic repeat request (Q-HARQ) scheme is proposed to achieve efficient resource utilization for high speed downlink packet access (HSDPA) in the universal mobile telecommunications system (UMTS). The HARQ procedure is modeled as a discrete-time Markov decision process (MDP), where the transmission cost is defined in terms of the signal-to-interference-and-noise ratio (SINR), which is based on the desired quality-of-service (QoS) parameter of transport block error rate (BLER), in order to enhance spectrum utilization subject to the QoS constraint. The Q-learning reinforcement learning algorithm is employed to accurately estimate the transmission cost and to select the most suitable modulation and coding scheme for the initial transmission of a packet while guaranteeing the transport block error rate requirement. Simulation results show that Q-HARQ nearly meets the QoS requirement on block error rate. In addition...