The application-specific multiprocessor System-on-a-Chip is a promising design alternative because of its high degree of flexibility, short development time, and potentially high performance attributed to application-specific optimizations. However, designing an optimal application-specific multiprocessor system remains challenging because several important metrics, such as throughput, latency, and resource usage, need to be explored and optimized. This paper addresses the problem of synthesizing an application-specific multiprocessor system to minimize latency and resource usage under a throughput constraint. We employ a novel framework for this problem, similar to that of technology mapping in the logic synthesis domain, and develop a set of efficient algorithms, including labeling, clustering, and packing, to generate a multiprocessor architecture with application-specific optimized latency and resources. Specifically, the result of our algor...