Instruction aggregation—the grouping of multiple operations into a single processing unit—is a technique that has recently been used to amplify the bandwidth and capacity of critical processor structures. This amplification can be used to improve IPC or to maintain IPC while reducing physical resources. Mini-graph processing is a particular instruction aggregation technique that targets dynamically-scheduled superscalar processors and achieves bandwidth and capacity amplification throughout the pipeline. The dark side of aggregation is serialization. External serialization is an effect common to many aggregation schemes. An aggregate cannot issue until all of its external inputs are ready. If the last-arriving input to an aggregate feeds what is not the first instruction, the entire aggregate can be delayed. Mini-graphs additionally suffer from internal serialization. Serialization can degrade performance, sometimes to the point of overwhelming the benefits of aggregation. Thi...