Conventional high-level synthesis uses the worst-case delay to relate all inputs to all outputs of an operation. This is a very conservative approximation of reality, especially for arithmetic operations, where some input bits are required later than others and some output bits are produced earlier than others. This paper proposes a pre-synthesis optimization algorithm that exploits this property for more efficient high-level synthesis of data-flow graphs formed by additions and multiplications. The presented pre-processor analyzes the critical path at bit granularity and splits the arithmetic operations into subword fragments. In particular, some of the multiplications in the specification are broken up into several smaller multiplications, additions, and operations of three new types specially defined to reduce the clock cycle duration. These fragments become the input to any regular high-level synthesis tool to speed up circuit execution times. The experimental results carried out show ...
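To make the bit-level argument concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes ripple-carry adders with a unit delay per full adder, unsigned operands, and all primary inputs available at time 0. The function names `bit_arrival_times` and `split_multiply` are hypothetical, and the three new operation types mentioned above are not modeled. The first function shows why a chain of dependent additions finishes much earlier than the worst-case model predicts; the second shows the plain schoolbook way of breaking one multiplication into smaller multiplications and additions.

```python
def bit_arrival_times(width: int, chain_length: int) -> list[int]:
    """Arrival time of each result bit after `chain_length` dependent
    ripple-carry additions (assumed model: one time unit per full adder)."""
    ready = [0] * width  # bits of the first operand, available at time 0
    for _ in range(chain_length):
        carry = 0  # carry-in of bit 0 is available at time 0
        next_ready = []
        for i in range(width):
            # Bit i needs bit i of the previous sum and the carry from bit i-1.
            t = max(ready[i], carry) + 1
            next_ready.append(t)
            carry = t  # carry-out is produced together with the sum bit
        ready = next_ready
    return ready


def split_multiply(a: int, b: int, k: int) -> int:
    """Multiply two unsigned integers using four smaller multiplications
    on fragments split at bit position k (schoolbook decomposition)."""
    mask = (1 << k) - 1
    a_lo, a_hi = a & mask, a >> k
    b_lo, b_hi = b & mask, b >> k
    high = (a_hi * b_hi) << (2 * k)
    middle = (a_hi * b_lo + a_lo * b_hi) << k
    low = a_lo * b_lo
    return high + middle + low


if __name__ == "__main__":
    width, chain = 16, 3
    worst_case = width * chain  # every addition waits for the full result
    bit_level = bit_arrival_times(width, chain)[-1]  # time of the last bit
    print(f"worst-case model: {worst_case}, bit-level model: {bit_level}")
    print(split_multiply(0xBEEF, 0x1234, 8) == 0xBEEF * 0x1234)
```

Under these assumptions, the most significant bit of three chained 16-bit additions is ready after 18 full-adder delays instead of the 48 assumed by the worst-case model, which is the slack that a bit-level critical path analysis can exploit.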
Rafael Ruiz-Sautua, María C. Molina, José ...