Focusing on embedded applications, scratchpad memories (SPMs) look like a best-compromise solution when taking into account performance, energy consumption and die area. The main challenge in SPM design is mapping memory locations to scratchpad locations. This paper describes an algorithm to optimally solve such a mapping problem by means of Dynamic Programming applied to a synthesizable hardware architecture. The algorithm works by mapping segments of external memory to physically partitioned banks of an on-chip SPM; this architecture provides significant energy savings. The algorithm does not require any user-set bound on the number of partitions and takes into account partitioning overhead. Improving on previous solutions, execution time is polynomial in the input size. Strategies to optimize memory requirements and speed of the algorithm are exploited. Additionally, we integrate this algorithm in a complete and automated design, simulation and synthesis flow. Categories and Subj...