In deep submicron circuits, thermal hot spots and large temperature gradients increase cooling costs and degrade reliability and performance. In this paper, we propose a low-cost temperature management strategy for multicore systems that reduces the adverse effects of hot spots and temperature variations. Our technique uses online learning to select, from a given set of expert policies, the policy best suited to the current workload characteristics. Compared with the best-performing expert, it reduces the frequency of hot spots and thermal cycles by 20% and 60% on average, respectively, and reduces spatial gradients to below 5%.

Categories and Subject Descriptors: B.8 [Performance and Reliability]: General; C.4 [Computer Systems Organization]: Performance of Systems.

General Terms: Management, Design, Reliability.