We have simulated the implementation of 16-bit floating-point instructions on a Pentium 4 and on PowerPC G4 and G5 processors to evaluate the performance impact of such instructions in embedded processors for graphics and multimedia applications. Both the accuracy of the computations and the execution time have been considered. For low-end embedded processors, 16-bit FP instructions deliver a larger dynamic range than 16-bit integers with the same memory footprint. For high-end embedded processors, a further speedup comes from wider SIMD instructions.
Lionel Lacassagne, Daniel Etiemble, S. A. Ould Kab
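
The dynamic-range claim in the abstract can be illustrated with a short sketch. It assumes the 16-bit FP format is laid out like IEEE 754 binary16 (1 sign bit, 5 exponent bits, 10 fraction bits); the abstract does not state the exact format, so this layout is an assumption. The helper names `half_from_bits` and `dynamic_range_db` are hypothetical and chosen only for illustration.

```python
import math
import struct


def half_from_bits(bits: int) -> float:
    """Decode a 16-bit pattern as a binary16 value (assumed format)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def dynamic_range_db(largest: float, smallest: float) -> float:
    """Ratio of largest to smallest nonzero magnitude, in decibels."""
    return 20.0 * math.log10(largest / smallest)


# 16-bit signed integer: magnitudes from 1 up to 2^15 - 1.
INT16_MAX = 2**15 - 1

# binary16 landmarks, decoded from their bit patterns.
HALF_MAX = half_from_bits(0x7BFF)            # largest finite value
HALF_MIN_NORMAL = half_from_bits(0x0400)     # smallest normal value
HALF_MIN_SUBNORMAL = half_from_bits(0x0001)  # smallest subnormal value

int16_db = dynamic_range_db(INT16_MAX, 1)
half_normal_db = dynamic_range_db(HALF_MAX, HALF_MIN_NORMAL)
half_full_db = dynamic_range_db(HALF_MAX, HALF_MIN_SUBNORMAL)

print(f"int16:                 {int16_db:.1f} dB")
print(f"half (normals only):   {half_normal_db:.1f} dB")
print(f"half (with subnormals): {half_full_db:.1f} dB")
```

Both formats occupy two bytes per value, so the memory footprint is identical. Under the assumed layout, the wider range comes at the cost of precision: the significand carries 11 bits, compared with 15 magnitude bits for a 16-bit integer, which is why accuracy has to be evaluated alongside execution time.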