We introduce a simple hierarchical design technique for using dynamic domino circuits to build high-performance self-timed data path circuits. We enclose the dynamic domino circuit in a wrapper that communicates using a request/acknowledge protocol and mediates the pre-charge/evaluate cycle of the dynamic logic. We apply standard bundled delay matching for completion detection, but add an early completion feature that can signal completion ahead of the matched delay whenever the validity of the function result can be determined from the output value itself. We call the resulting wrapper semi-bundled because of this early acknowledge. The circuit overhead required for the semi-bundled feature is relatively small, and the feature can provide measurable speedup in some situations. The technique is suitable for any dynamic logic family that has a pre-charge/evaluate cycle and produces monotonic output transitions.

Categories and Subject Descriptors
B.6.1 [Logic Design]: Design Styles – Combinational Logic, Sequential Logic.

General Terms
Performance, De...