Sciweavers

555 search results - page 85 / 111
» Efficient event-driven simulation of parallel processor arch...
ASPLOS
2009
ACM
14 years 9 months ago
QR decomposition on GPUs
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A into the product of a unitary matrix Q and an upper triangular matrix R. Adaptive sys...
Andrew Kerr, Dan Campbell, Mark Richards
ISCC
2005
IEEE
119 views, Communications
14 years 2 months ago
A Systematic Approach to Building High Performance Software-Based CRC Generators
A framework for designing a family of novel fast CRC generation algorithms is presented. Our algorithms can ideally read arbitrarily large amounts of data at a time, while optim...
Michael E. Kounavis, Frank L. Berry
JPDC
2000
141 views
13 years 8 months ago
A System for Evaluating Performance and Cost of SIMD Array Designs
SIMD arrays are likely to become increasingly important as coprocessors in domain-specific systems as architects continue to leverage RAM technology in their design. The problem ...
Martin C. Herbordt, Jade Cravy, Renoy Sam, Owais K...
EGH
2009
Springer
13 years 6 months ago
Efficient ray traced soft shadows using multi-frusta tracing
Ray tracing has long been considered superior to rasterization because of its ability to trace arbitrary rays, allowing it to simulate virtually any physical light transport ef...
Carsten Benthin, Ingo Wald
IPCCC
2006
IEEE
14 years 2 months ago
OS-aware tuning: improving instruction cache energy efficiency on system workloads
Low power consumption has been considered an important issue in instruction cache (I-cache) designs. Several studies have shown that the I-cache can be tuned to reduce power. These techniq...
Tao Li, Lizy K. John