Sciweavers

17 search results - page 4 / 4
» A parallel IEEE P754 decimal floating-point multiplier
Sort
View
ICPP
1999
IEEE
14 years 20 days ago
Impact on Performance of Fused Multiply-Add Units in Aggressive VLIW Architectures
Loops are the main time consuming part of programs based on floating point computations. The performance of the loops is limited either by recurrences in the computation or by the...
David López, Josep Llosa, Eduard Ayguad&eac...
ISPDC
2010
IEEE
13 years 6 months ago
Pretty Good Accuracy in Matrix Multiplication with GPUs
—With systems such as Road Runner, there is a trend in super computing to offload parallel tasks to special purpose co-processors, composed of many relatively simple scalar proc...
Matthew Badin, Lubomir Bic, Michael B. Dillencourt...