This paper presents an architecture and a wrapper synthesis approach for the design of multi-clock systems-on-chip. We build upon the initial work on multi-clock latency-insensitive systems by Singh and Theobald [1] and provide a detailed system architecture with the following capabilities and benefits: (i) modules are stalled only when necessary, (ii) adequate metastability resolution is provided, (iii) handshake interfaces between modules are high-performance and low-latency, i.e., capable of transferring a data packet on every clock cycle, (iv) IP cores with large clock distribution delays are handled correctly, and (v) an automated approach is provided for wrapper synthesis from formal specifications. For wrapper synthesis, we have developed an automated tool that accepts interface specifications in a high-level language (Component Wrapper Language, or CWL [2]) and produces gate-level implementations of wrapper circuitry t...