In this paper we develop an adaptive learning algorithm with polynomial complexity that is approximately optimal for an opportunistic spectrum access (OSA) problem. In this OSA problem each channel is modeled as a two-state discrete-time Markov chain with a bad state, which yields no reward, and a good state, which yields a reward. This is known as the Gilbert-Elliott channel model and represents variations in the channel condition due to fading, primary user activity, etc. A single user can transmit on one channel at a time and aims to maximize its throughput. Because the user does not know the transition probabilities and observes only the state of the currently selected channel, it faces a partially observed Markov decision problem (POMDP) with unknown transition structure. In general, learning the optimal policy in this setting is intractable. We propose a computationally efficient learning algorithm which is approximately optimal for the infinite-horizon average reward cri...