This paper presents a method, called multiple constant multiplier trees (MCMTs), for producing optimized reconfigurable hardware implementations of vector products. An algorithm for generating MCMTs, based on a novel representation of common subexpressions in constant data patterns, has been developed and implemented. Our optimization framework covers a wider solution space than previous approaches; it also supports exploitation of full and partial run-time reconfiguration as well as technology-specific constraints, such as fanout limits and routing. We demonstrate that while distributed arithmetic techniques require storage size exponential in the number of coefficients, the resource utilization of MCMTs usually grows linearly with problem size. MCMTs have been implemented in Xilinx 4000 and Virtex FPGAs, and their size and speed efficiency are confirmed in comparisons with Xilinx LogiCore and ASIC implementations of FIR filter designs. Preliminary results show that the size of MCMT circu...
Dan Benyamin, John D. Villasenor, Wayne Luk
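The contrast drawn in the abstract between exponential table storage and shared shift-and-add arithmetic can be illustrated with a minimal sketch. It is not the MCMT generation algorithm, which the abstract does not specify; the function names `da_lut_entries` and `multiply_by_5_and_13` and the constants 5 and 13 are assumptions chosen only for illustration. The first function counts the entries of a single, unpartitioned distributed-arithmetic lookup table; the second multiplies one input by two constants while reusing a common subexpression.

```python
from typing import Tuple


def da_lut_entries(num_coefficients: int) -> int:
    """Entries in one unpartitioned distributed-arithmetic lookup table.

    The table is addressed by one bit from each of the inputs, so it
    holds every possible partial sum of the coefficients.
    """
    return 2 ** num_coefficients


def multiply_by_5_and_13(x: int) -> Tuple[int, int]:
    """Multiply x by the constants 5 and 13 using shifts and adds only.

    Hypothetical example of common-subexpression sharing:
    13x is built from 5x, so two adders suffice instead of three.
    """
    x5 = (x << 2) + x    # 5x  = 4x + x   (first adder)
    x13 = (x << 3) + x5  # 13x = 8x + 5x  (second adder, reuses 5x)
    return x5, x13


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, "coefficients:", da_lut_entries(n), "table entries")
    print(multiply_by_5_and_13(7))
```

Computed independently, 5x = 4x + x and 13x = 8x + 4x + x would take three adders; sharing the 5x term removes one of them, and savings of this kind are what a common-subexpression representation is meant to expose.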