In this paper we propose an architecture for the computation of the double—precision floating—point multiply—add fused (MAF) operation A + (B × C) that permits to compute ...
The paper introduces fine-grain clockgating schemes for fused multiply-add-type floating-point units (FPU). The clockgating is based on instruction type, precision and operand v...
Loops are the main time consuming part of programs based on floating point computations. The performance of the loops is limited either by recurrences in the computation or by the...
—Due to the widespread use and inherent complexity of floating-point addition, much effort has been devoted to its speedup via algorithmic and circuit techniques. We propose a ne...
The design and implementation of a double precision floating-point IEEE-754 standard adder is described which uses "flagged prefix addition" to merge rounding with the s...
Andrew Beaumont-Smith, Neil Burgess, S. Lefrere, C...