Modern high-speed interconnects such as Myrinet and Gigabit Ethernet have shifted the communication bottleneck from the interconnect to the messaging software at the sending and receiving ends. The development of user-level protocols and their implementations on smart, programmable network interface cards (NICs) have been alleviating this bottleneck. Most of the user-level protocols developed so far have been based on single-CPU NICs. However, one of the more popular current-generation Gigabit Ethernet NICs includes two CPUs. This raises the open question of whether the performance of user-level protocols can be improved by taking advantage of a multi-CPU NIC. In this paper, we analyze the intrinsic issues associated with this question and explore different parallelization and pipelining schemes to enhance the performance of our previously developed EMP protocol for single-CPU Alteon NICs. Four different strategies are proposed and implemented on our testbed. Performance evalua...
Piyush Shivam, Pete Wyckoff, Dhabaleswar K. Panda