Sciweavers

» Using a Swap Instruction to Coalesce Loads and Stores
DSD
2009
IEEE
14 years 2 months ago
SIMD Architectural Enhancements to Improve the Performance of the 2D Discrete Wavelet Transform
The 2D Discrete Wavelet Transform (DWT) is a time-consuming kernel in many multimedia applications such as JPEG2000 and MPEG-4. The 2D DWT consists of horizontal filtering alon...
Asadollah Shahbahrami, Ben H. H. Juurlink
HPCA
2004
IEEE
14 years 7 months ago
Out-of-Order Commit Processors
Modern out-of-order processors tolerate long latency memory operations by supporting a large number of inflight instructions. This is particularly useful in numerical applications...
Adrián Cristal, Daniel Ortega, Josep Llosa,...
ISLPED
2003
ACM
14 years 19 days ago
Reducing data cache energy consumption via cached load/store queue
High-performance processors use a large set-associative L1 data cache with multiple ports. As clock speeds and size increase, such a cache consumes a significant percentage of t...
Dan Nicolaescu, Alexander V. Veidenbaum, Alexandru...
ISCA
2008
IEEE
13 years 7 months ago
From Speculation to Security: Practical and Efficient Information Flow Tracking Using Speculative Hardware
Dynamic information flow tracking (also known as taint tracking) is an appealing approach to combat various security attacks. However, the performance of applications can severely...
Haibo Chen, Xi Wu, Liwei Yuan, Binyu Zang, Pen-Chu...
ICCS
2003
Springer
14 years 18 days ago
Exploiting Stability to Reduce Time-Space Cost for Memory Tracing
Memory traces record the addresses touched by a program during its execution, enabling many useful investigations for understanding and predicting program performance. But complete...
Xiaofeng Gao, Allan Snavely