ASAP
2005
IEEE

Performance Comparison of SIMD Implementations of the Discrete Wavelet Transform

This paper focuses on SIMD implementations of the 2D discrete wavelet transform (DWT). The transforms considered are Daubechies’ real-to-real method of four coefficients (Daub-4) and the integer-to-integer (5, 3) lifting scheme. Daub-4 is implemented using SSE and the lifting scheme using MMX, and their performance is compared to C implementations on a Pentium 4 processor. The MMX implementation of the lifting scheme is up to 4.0x faster than the corresponding C program for a 1-level 2D DWT, while the SSE implementation of Daub-4 is up to 2.6x faster than the C version. It is shown that for some image sizes, the performance is significantly hampered by the so-called 64K aliasing problem, which occurs in the Pentium 4 when two data blocks are accessed that are a multiple of 64K apart. It is also shown that for the (5, 3) lifting scheme, a 12-bit word size is sufficient for a 5-level decomposition of the 2D DWT for images of up to 10 bits per pixel.
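As a rough illustration of the integer-to-integer (5, 3) lifting scheme named in the abstract, the following scalar C sketch performs one 1D forward and inverse step. It is not the paper's MMX code. The function names (`floor_div`, `lift53_forward`, `lift53_inverse`), the even-length restriction, and the symmetric boundary extension (the convention used by the reversible 5/3 filter in JPEG 2000) are assumptions made for this example. A 2D transform would apply the same step along rows and then along columns.

```c
#include <stddef.h>

/* Floor division for a positive divisor; C's '/' truncates toward zero,
   which differs from floor for negative numerators. */
static int floor_div(int a, int b)
{
    int q = a / b;
    if ((a % b != 0) && (a < 0))
        q--;
    return q;
}

/* One forward 1D (5, 3) lifting step (hypothetical helper).
   x: n input samples, n even and >= 2.
   s: n/2 low-pass (approximation) outputs.
   d: n/2 high-pass (detail) outputs. */
void lift53_forward(const int *x, size_t n, int *s, int *d)
{
    size_t half = n / 2;

    /* Predict: each odd sample minus the floor average of its even neighbours. */
    for (size_t i = 0; i < half; i++) {
        int left = x[2 * i];
        /* Assumed symmetric extension at the right edge: x[n] = x[n-2]. */
        int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[2 * i];
        d[i] = x[2 * i + 1] - floor_div(left + right, 2);
    }

    /* Update: each even sample plus a rounded quarter of the adjacent details. */
    for (size_t i = 0; i < half; i++) {
        /* Assumed symmetric extension at the left edge: d[-1] = d[0]. */
        int dl = (i > 0) ? d[i - 1] : d[0];
        s[i] = x[2 * i] + floor_div(dl + d[i] + 2, 4);
    }
}

/* Inverse step: undo the update, then undo the predict. */
void lift53_inverse(const int *s, const int *d, size_t n, int *x)
{
    size_t half = n / 2;

    for (size_t i = 0; i < half; i++) {
        int dl = (i > 0) ? d[i - 1] : d[0];
        x[2 * i] = s[i] - floor_div(dl + d[i] + 2, 4);
    }
    for (size_t i = 0; i < half; i++) {
        int left = x[2 * i];
        int right = (2 * i + 2 < n) ? x[2 * i + 2] : x[2 * i];
        x[2 * i + 1] = d[i] + floor_div(left + right, 2);
    }
}
```

Because every step adds or subtracts an integer computed from values the inverse can recompute, the round trip is exact. For example, calling `lift53_forward` on the eight samples 10, 12, ..., 24 and then `lift53_inverse` on the resulting `s` and `d` arrays returns the original samples.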
Added 24 Jun 2010
Updated 24 Jun 2010
Type Conference
Year 2005
Where ASAP
Authors Asadollah Shahbahrami, Ben H. H. Juurlink, Stamatis Vassiliadis