Sciweavers

» Compilation Techniques for Out-of-Core Parallel Computations
HPDC
2008
IEEE
13 years 8 months ago
Code coverage, performance approximation and automatic recognition of idioms in scientific applications
Basic data flow patterns which we call idioms, such as stream, transpose, reduction, random access and stencil, are common in scientific numerical applications. We hypothesize tha...
Jiahua He, Allan Snavely, Rob F. Van der Wijngaart...
IEEEPACT
2000
IEEE
14 years 1 months ago
Global Register Partitioning
Modern computers have taken advantage of the instruction-level parallelism (ILP) available in programs with advances in both architecture and compiler design. Unfortunately, large...
Jason Hiser, Steve Carr, Philip H. Sweany
ICS
1999
Tsinghua U.
14 years 26 days ago
Eliminating synchronization bottlenecks in object-based programs using adaptive replication
This paper presents a technique, adaptive replication, for automatically eliminating synchronization bottlenecks in multithreaded programs that perform atomic operations on object...
Martin C. Rinard, Pedro C. Diniz
HPCA
1997
IEEE
14 years 24 days ago
Datapath Design for a VLIW Video Signal Processor
This paper represents a design study of the datapath for a very long instruction word (VLIW) video signal processor (VSP). VLIW architectures provide high parallelism and excellen...
Andrew Wolfe, Jason Fritts, Santanu Dutta, Edil S....
IEEEPACT
1999
IEEE
14 years 27 days ago
The Effect of Program Optimization on Trace Cache Efficiency
Trace cache, an instruction fetch technique that reduces taken branch penalties by storing and fetching program instructions in dynamic execution order, dramatically improves inst...
Derek L. Howard, Mikko H. Lipasti