We present a novel approach that minimizes the power consumption of embedded core-based systems through hardware/software partitioning. The approach maps clusters of operations/instructions to a core that achieves a high utilization rate of the involved resources (ALUs, multipliers, shifters, etc.), thereby minimizing power consumption. It is comprehensive in that it accounts for the power consumption of the whole embedded system, comprising a microprocessor core, application-specific (ASIC) core(s), cache cores, and a memory core. We report reductions in power consumption of between 35% and 94% at the cost of a relatively small hardware overhead of less than 16k cells, while maintaining or even slightly improving performance compared to the initial design.