This paper presents an enhancement of our "Algorithm Architecture Adequation" (AAA) prototyping methodology which allows to rapidly develop and optimize the implementation of a reactive real-time dataflow algorithm on a embedded heterogeneous multiprocessor architecture, predict its real-time behavior and automatically generate the corresponding distributed and optimized static executive. It describes a new optimization heuristic able to support heterogeneous architectures and takes into account accurately inter-processor communications, which are usually neglected but may reduce dramatically multiprocessor performances.