On-chip networks are critical to the scaling of future multicore processors. The challenge for on-chip network is to reduce the cost including power consumption and area while providing high performance such as low latency and high bandwidth. Although much research in on-chip network have focused on improving the performance of on-chip networks, they have often relied on a router microarchitecture adopted from off-chip networks. As a result, the on-chip network architecture will not scale properly because of design complexity. In this paper, we propose a low-cost,on-chip network router microarchitecture which is different from the commonly assumed baseline router microarchitecture. We reduce the cost of on-chip networks by partitioning the crossbar, prioritizing packets in flight to simplify arbitration, and reducing the amount of buffers. We show that by introducing intermediate buffers to decouple the routing in the x and the y dimensions, high performance can be achieved with ...