Sciweavers

338 search results - page 66 / 68
» Automated Performance Prediction of Message-Passing Parallel...
Sort
View
CODES
2005
IEEE
14 years 1 months ago
High-level synthesis for large bit-width multipliers on FPGAs: a case study
In this paper, we present the analysis, design and implementation of an estimator to realize large bit width unsigned integer multiplier units. Larger multiplier units are require...
Gang Quan, James P. Davis, Siddhaveerasharan Devar...
ASPLOS
2009
ACM
14 years 8 months ago
StreamRay: a stream filtering architecture for coherent ray tracing
The wide availability of commodity graphics processors has made real-time graphics an intrinsic component of the human/computer interface. These graphics cores accelerate the z-bu...
Karthik Ramani, Christiaan P. Gribble, Al Davis
OOPSLA
2005
Springer
14 years 26 days ago
X10: an object-oriented approach to non-uniform cluster computing
It is now well established that the device scaling predicted by Moore’s Law is no longer a viable option for increasing the clock frequency of future uniprocessor systems at the...
Philippe Charles, Christian Grothoff, Vijay A. Sar...
IEEEPACT
2000
IEEE
13 years 11 months ago
A Lightweight Algorithm for Dynamic If-Conversion during Dynamic Optimization
Dynamic Optimization is an umbrella term that refers to any optimization of software that is performed after the initial compile time. It is a complementary optimization opportuni...
Kim M. Hazelwood, Thomas M. Conte
ICS
1999
Tsinghua U.
13 years 11 months ago
Software trace cache
—This paper explores the use of compiler optimizations which optimize the layout of instructions in memory. The target is to enable the code to make better use of the underlying ...
Alex Ramírez, Josep-Lluis Larriba-Pey, Carl...