We present the design of a high-performance, highly pipelined asynchronous FPGA. We describe a very fine-grain pipelined logic block and routing interconnect architecture, and show how asynchronous logic can efficiently take advantage of this large amount of pipelining. Our FPGA, which does not use a clock to sequence computations, automatically “self-pipelines” its logic without the designer needing to be explicitly aware of all pipelining details. This property makes our FPGA well suited to throughput-intensive applications, and it requires only minimal place-and-route support to achieve good performance. Benchmark circuits taken from both the asynchronous and clocked design communities yield throughputs in the neighborhood of 300–400 MHz in a TSMC 0.25 µm process and 500–700 MHz in a TSMC 0.18 µm process.

Categories and Subject Descriptors
B.7.1 [Integrated Circuits]: Types and Design Styles - Gate Arrays, VLSI; B.6.1 [Logic Design]: Design Styles - Parallel Circuits

General Terms
De...