The high-performance switch plays a critical role in a high-performance computer (HPC) system. HPC applications demand not only low latency and high bandwidth from the switch, but also effective support for collective communication such as broadcast, multicast, and barrier. This paper introduces HPP Switch, the core component of the interconnection network of an HPC prototype, which is designed to meet these requirements. It provides 38.4 ns zero-load latency, 160 Gbps aggregate bandwidth, 16 multicast groups, and 16 barrier groups. HPP Switch is implemented in a 0.13 µm CMOS standard-cell ASIC technology. Simulation results show that multicast and barrier operations across 1024 nodes complete within 2 µs, and that a single stage of the barrier operation takes only 128 ns.