Parallel processing networks, even full crossbars, that implement only point-to-point and multicast message passing are inefficient for collective communications because multiple messages must be transmitted to or from each processor to implement a single collective operation. However, all of the information needed for a collective communication can be made available to the network control logic within a single communication. By making this control logic capable of executing functions on the information aggregated from all of the processors, any collective communication can be implemented without additional messages or processor involvement. Networks with such logic are called aggregate networks; they can perform routing, computation, and storage/retrieval of global information. This paper gives a detailed example of each of these three types of aggregate function.
Raymond Hoare, Henry G. Dietz
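
The idea in the abstract can be illustrated with a small software model. The sketch below is only an illustration under stated assumptions, not the authors' hardware design: the class `AggregateNetwork`, its `step` method, and the helper functions `all_sum`, `route`, `save`, and `load` are hypothetical names introduced here. It assumes every processor contributes exactly one value per operation, and that the control logic sees all contributions together before returning one result to each processor.

```python
from typing import Any, Callable, Dict, List, Sequence

# An aggregate function sees every processor's contribution plus the
# network's stored state, and returns one result per processor.
AggregateFn = Callable[[Sequence[Any], Dict[str, Any]], List[Any]]


class AggregateNetwork:
    """Hypothetical software model of an aggregate network."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.store: Dict[str, Any] = {}  # global information kept by the network

    def step(self, contributions: Sequence[Any], fn: AggregateFn) -> List[Any]:
        """One collective operation: a single communication per processor."""
        if len(contributions) != self.n:
            raise ValueError("every processor must contribute exactly once")
        snapshot = tuple(contributions)  # the aggregated information
        results = fn(snapshot, self.store)
        if len(results) != self.n:
            raise ValueError("aggregate function must return one result per processor")
        return results


def all_sum(data: Sequence[Any], store: Dict[str, Any]) -> List[Any]:
    """Computation: every processor receives the global sum."""
    total = sum(data)
    return [total] * len(data)


def route(data: Sequence[Any], store: Dict[str, Any]) -> List[Any]:
    """Routing: each contribution is (source, payload); processor i
    receives the payload contributed by the source it named."""
    return [data[src][1] for src, _payload in data]


def save(key: str) -> AggregateFn:
    """Storage: keep the aggregated contributions in the network under `key`."""
    def fn(data: Sequence[Any], store: Dict[str, Any]) -> List[Any]:
        store[key] = data
        return [None] * len(data)
    return fn


def load(key: str) -> AggregateFn:
    """Retrieval: return to each processor the value it stored under `key`."""
    def fn(data: Sequence[Any], store: Dict[str, Any]) -> List[Any]:
        return list(store[key])
    return fn


if __name__ == "__main__":
    net = AggregateNetwork(4)
    print(net.step([1, 2, 3, 4], all_sum))
    print(net.step([(1, "a"), (2, "b"), (3, "c"), (0, "d")], route))
    net.step([5, 6, 7, 8], save("x"))
    print(net.step([None] * 4, load("x")))
```

In this model each collective operation costs one call to `step`, regardless of which function the control logic applies, which mirrors the claim that no additional messages or processor involvement are needed once the information has been aggregated.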