—The performance estimation of complex multi-processor systems-on-chip (MPSoC) in a reasonable amount of time and with a good accuracy becomes more and more challenging due to the increasing number of embedded components and the resulting complex interactions. In this paper, we present a modular tracebased simulation framework, targeting the performance analysis of stream-oriented applications on complex MPSoC architectures. Our framework can analyze systems with a large number of hardware components, while considering various aspects like resource sharing, multi-hop communication, and memory allocation. We demonstrate the potential of our framework by real-life case studies and obtain a speedup of several orders of magnitude and an average accuracy of 97% when compared with the execution on a commercial instruction-accurate simulator.