We present a de-layered protocol engine for terminating 40Gbps TCP connections on a reconfigurable FPGA silicon platform. The protocol engine is designed for a planned attempt at the Internet Speed Record. In laboratory demonstrations at 40Gbps, this core beat the previous record of 7.2Gbps by more than a factor of five. We present an aggressive cross-layer optimization methodology, together with the corresponding design flow and tools, used to implement this record-breaking TCP Protocol Engine. The 40Gbps TCP Offload Engine has been implemented on a Xilinx FPGA platform based on the Virtex-II Pro 2VP7 device. Each FPGA device terminates a 10Gbps OC-768 channel, and the aggregate capacity of the four FPGA devices is 40Gbps. The four 10Gbps channels are intended to be connected to four trunked 10GbE Ethernet ports on a router. The 40Gbps TCP implementation has been demonstrated in the lab in system-level as well as gate-level simulations, and live implementations have been tested with each 10Gbps channel...
H. Shrikumar