Sciweavers

227 search results - page 12 / 46
» Limits to the Performance of Software Shared Memory: A Layer...
SIGMOD
2010
ACM
205 views, Database
14 years 21 days ago
Performing sound flash device measurements: some lessons from uFLIP
It is amazingly easy to get meaningless results when measuring flash devices, partly because of the peculiarity of flash memory, but primarily because their behavior is determin...
Matias Bjørling, Lionel Le Folgoc, Ahmed Ms...
CODES
2004
IEEE
13 years 11 months ago
Optimizing the memory bandwidth with loop fusion
The memory bandwidth largely determines the performance and energy cost of embedded systems. At the compiler level, several techniques improve the memory bandwidth at the scope of...
Paul Marchal, José Ignacio Gómez, Fr...
IPPS
2007
IEEE
14 years 2 months ago
Programming Distributed Memory Systems Using OpenMP
OpenMP has emerged as an important model and language extension for shared-memory parallel programming. On shared-memory platforms, OpenMP offers an intuitive, incremental approac...
Ayon Basumallik, Seung-Jai Min, Rudolf Eigenmann
ISCA
2012
IEEE
279 views, Hardware
11 years 10 months ago
Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems
When multiple processor (CPU) cores and a GPU integrated together on the same chip share the off-chip main memory, requests from the GPU can heavily interfere with requests from t...
Rachata Ausavarungnirun, Kevin Kai-Wei Chang, Lava...
ISCA
1999
IEEE
87 views, Hardware
14 years 5 days ago
Memory Forwarding: Enabling Aggressive Layout Optimizations by Guaranteeing the Safety of Data Relocation
By optimizing data layout at run-time, we can potentially enhance the performance of caches by actively creating spatial locality, facilitating prefetching, and avoiding cache con...
Chi-Keung Luk, Todd C. Mowry