Sciweavers

184 search results - page 25 / 37
» Compress-and-conquer for optimal multicore computing
DAC
2010
ACM
13 years 11 months ago
Trace-driven optimization of networks-on-chip configurations
Networks-on-chip (NoCs) are becoming increasingly important in general-purpose and application-specific multi-core designs. Although uniform router configurations are appropriate ...
Andrew B. Kahng, Bill Lin, Kambiz Samadi, Rohit Su...
ASPLOS
2009
ACM
14 years 9 months ago
RapidMRC: approximating L2 miss rate curves on commodity systems for online optimizations
Miss rate curves (MRCs) are useful in a number of contexts. In our research, online L2 cache MRCs enable us to dynamically identify optimal cache sizes when cache-partitioning a s...
David K. Tam, Reza Azimi, Livio Soares, Michael St...
EUROPAR
2007
Springer
14 years 2 months ago
Toward Scalable Matrix Multiply on Multithreaded Architectures
We show empirically that some of the issues that affected the design of linear algebra libraries for distributed memory architectures will also likely affect such libraries for s...
Bryan Marker, Field G. Van Zee, Kazushige Goto, Gr...
LCPC
2009
Springer
14 years 1 months ago
MIMD Interpretation on a GPU
Programming heterogeneous parallel computer systems is notoriously difficult, but MIMD models have proven to be portable across multi-core processors, clusters, and massively paral...
Henry G. Dietz, B. Dalton Young
PROCEDIA
2010
13 years 3 months ago
Towards generating optimised finite element solvers for GPUs from high-level specifications
We argue that producing maintainable high-performance implementations of finite element methods for multiple targets requires that they are written using a high-level domain-speci...
Graham R. Markall, David A. Ham, Paul H. J. Kelly