Sciweavers

272 search results - page 35 / 55
» Code Transformations to Improve Memory Parallelism
Sort
View
EUC
2005
Springer
14 years 2 months ago
Optimizing Nested Loops with Iterational and Instructional Retiming
Abstract. Embedded systems have strict timing and code size requirements. Retiming is one of the most important optimization techniques to improve the execution time of loops by in...
Chun Xue, Zili Shao, Meilin Liu, Mei Kang Qiu, Edw...
IPPS
2005
IEEE
14 years 2 months ago
Fast Address Translation Techniques for Distributed Shared Memory Compilers
The Distributed Shared Memory (DSM) model is designed to leverage the ease of programming of the shared memory paradigm, while enabling the highperformance by expressing locality ...
François Cantonnet, Tarek A. El-Ghazawi, Pa...
ISPASS
2008
IEEE
14 years 3 months ago
Pinpointing and Exploiting Opportunities for Enhancing Data Reuse
—The potential for improving the performance of data-intensive scientific programs by enhancing data reuse in cache is substantial because CPUs are significantly faster than me...
Gabriel Marin, John M. Mellor-Crummey
IPPS
2009
IEEE
14 years 3 months ago
Validating Wrekavoc: A tool for heterogeneity emulation
Experimental validation and testing of solutions designed for heterogeneous environment is a challenging issue. Wrekavoc is a tool for performing such validation. It runs unmodi...
Olivier Dubuisson, Jens Gustedt, Emmanuel Jeannot
HPCA
2008
IEEE
14 years 9 months ago
Uncovering hidden loop level parallelism in sequential applications
As multicore systems become the dominant mainstream computing technology, one of the most difficult challenges the industry faces is the software. Applications with large amounts ...
Hongtao Zhong, Mojtaba Mehrara, Steven A. Lieberma...