Sciweavers

272 search results - page 34 / 55
» Code Transformations to Improve Memory Parallelism
Sort
View
WMPI
2004
ACM
14 years 2 months ago
The Opie compiler from row-major source to Morton-ordered matrices
The Opie Project aims to develop a compiler to transform C codes written for row-major matrix representation into equivalent codes for Morton-order matrix representation, and to a...
Steven T. Gabriel, David S. Wise
MICRO
2000
IEEE
176views Hardware» more  MICRO 2000»
13 years 8 months ago
An Advanced Optimizer for the IA-64 Architecture
level of abstraction, compared with the program representation for scalar optimizations. For example, loop unrolling and loop unrolland-jam transformations exploit the large regist...
Rakesh Krishnaiyer, Dattatraya Kulkarni, Daniel M....
ICS
2010
Tsinghua U.
14 years 1 months ago
Overlapping communication and computation by using a hybrid MPI/SMPSs approach
– Communication overhead is one of the dominant factors that affect performance in high-performance computing systems. To reduce the negative impact of communication, programmers...
Vladimir Marjanovic, Jesús Labarta, Eduard ...
HPCC
2009
Springer
14 years 1 months ago
On Instruction-Level Method for Reducing Cache Penalties in Embedded VLIW Processors
Usual cache optimisation techniques for high performance computing are difficult to apply in embedded VLIW applications. First, embedded applications are not always well structur...
Samir Ammenouche, Sid Ahmed Ali Touati, William Ja...
ICS
2007
Tsinghua U.
14 years 2 months ago
Scheduling FFT computation on SMP and multicore systems
Increased complexity of memory systems to ameliorate the gap between the speed of processors and memory has made it increasingly harder for compilers to optimize an arbitrary code...
Ayaz Ali, S. Lennart Johnsson, Jaspal Subhlok