Sciweavers

ACMMSP
2004
ACM
14 years 2 months ago
Instruction combining for coalescing memory accesses using global code motion
Instruction combining is an optimization that replaces a sequence of instructions with a more efficient instruction yielding the same result in fewer machine cycles. When we use it...
Motohiro Kawahito, Hideaki Komatsu, Toshio Nakatan...
EUROPAR
2007
Springer
14 years 3 months ago
Toward Scalable Matrix Multiply on Multithreaded Architectures
We show empirically that some of the issues that affected the design of linear algebra libraries for distributed memory architectures will also likely affect such libraries for s...
Bryan Marker, Field G. Van Zee, Kazushige Goto, Gr...
VLSISP
2008
13 years 8 months ago
Effective Code Generation for Distributed and Ping-Pong Register Files: A Case Study on PAC VLIW DSP Cores
The compiler is generally regarded as the most important software component that supports a processor design to achieve success. This paper describes our application of the open re...
Yung-Chia Lin, Chia-Han Lu, Chung-Ju Wu, Chung-Lin...
PLDI
2004
ACM
14 years 2 months ago
Vectorization for SIMD architectures with alignment constraints
When vectorizing for SIMD architectures that are commonly employed by today’s multimedia extensions, one of the new challenges that arise is the handling of memory alignment. Pr...
Alexandre E. Eichenberger, Peng Wu, Kevin O'Brien
DATE
2006
IEEE
14 years 3 months ago
Automatic SystemC design configuration for a faster evaluation of different partitioning alternatives
In this paper we present a methodology that is based on SystemC [1] for rapid prototyping to greatly enhance and accelerate the exploration of complex systems to optimize the syst...
Nico Bannow, Karsten Haug, Wolfgang Rosenstiel