In a variety of applications, ranging from data integration to distributed query evaluation, there is a need to obtain sets of data items from several sources (peers) and compute their union. As these sets often contain common data items, avoiding the transmission of redundant information is essential for effective union computation. In this paper we define the notion of optimal union plans for nondisjoint data sets residing on distinct peers, and present efficient algorithms for computing and executing such optimal plans. Our algorithms avoid redundant data transmission and optimally exploit the network bandwidth capabilities. A challenge in the design of optimal plans is the lack of a complete map of the distribution of the data items among peers. We analyze the information required for optimal planning and propose novel techniques to obtain compact, cheap to communicate, description of the data sources. We then exploit it for efficient union computation with reasonable accuracy....