Exponentially rising cooling/packaging costs due to high power density call for architectural and software-level thermal management. Dynamic thermal management (DTM) techniques continuously monitor the on-chip processor temperature. Appropriate mechanisms (e.g., dynamic voltage or frequency scaling (DVFS), clock gating, fetch gating, etc.) are engaged to lower the temperature if it exceeds a threshold. However, all these mechanisms incur significant performance penalty. We argue that runtime adaptation of micro-architectural parameters, such as instruction window size and issue width, is a more effective mechanism for DTM. If the architectural parameters can be tailored to track the available instruction-level parallelism of the program, the temperature is reduced with minimal performance degradation. Moreover, synergistically combining architectural adaptation with DVFS and fetch gating can achieve the best performance under thermal constraints. The key difficulty in using multiple m...