On-network hardware support for multi-destination traffic is a desirable feature in most multiprocessor machines. Hardware multicast uses bandwidth much more effectively, because a multi-destination packet does not repeatedly consume the same resources, as happens when multicast traffic must be decomposed into unicast packets. Chip multiprocessors are no exception, yet to date few suitable proposals exist. This situation stems from the scarcity of available resources combined with the common belief that multicast support requires a substantial amount of extra resources. In this work, we propose a new approach, suitable for on-chip networks, that handles multi-destination traffic efficiently in hardware with negligible added complexity. We introduce the Multicast Rotary Router (MRR), a router able to: (1) provide on-network multicast support at almost zero cost over the Rotary Router, (2) use a fully adaptive tree to...