The Translation Lookaside Buffer (TLB) is a key part of the hardware support for virtual memory management in high-performance embedded systems. Although small, the TLB is accessed very frequently, so it not only consumes significant energy but is also one of the major thermal hot-spots in the processor. Recently, several circuit-level and microarchitectural TLB implementations have been proposed to reduce TLB power. One simple yet effective TLB design for power reduction is the Use-Last TLB architecture proposed in [9]. The Use-Last TLB architecture reduces power consumption when the most recently accessed page is accessed again. While very effective for the instruction TLB, this technique is not as effective for the data TLB. In this paper, we propose compiler techniques (specifically, instruction and operand reordering, array interleaving, and loop unrolling) to reduce the number of page switches in data accesses. Our comprehensive page-switch reduction algorithm results ...