Sciweavers

294 search results - page 43 / 59
» Architectural Exploration and Optimization of Local Memory i...
Sort
View
ICCS
2009
Springer
15 years 10 months ago
Generating Empirically Optimized Composed Matrix Kernels from MATLAB Prototypes
The development of optimized codes is time-consuming and requires extensive architecture, compiler, and language expertise, therefore, computational scientists are often forced to ...
Boyana Norris, Albert Hartono, Elizabeth R. Jessup...
CODES
2008
IEEE
15 years 10 months ago
Static analysis of processor stall cycle aggregation
Processor Idle Cycle Aggregation (PICA) is a promising approach for low power execution of processors, in which small memory stalls are aggregated to create a large one, and the p...
Jongeun Lee, Aviral Shrivastava
IEEEPACT
2006
IEEE
15 years 10 months ago
Whole-program optimization of global variable layout
On machines with high-performance processors, the memory system continues to be a performance bottleneck. Compilers insert prefetch operations and reorder data accesses to improve...
Nathaniel McIntosh, Sandya Mannarswamy, Robert Hun...
SSDBM
2010
IEEE
248views Database» more  SSDBM 2010»
15 years 9 months ago
Client + Cloud: Evaluating Seamless Architectures for Visual Data Analytics in the Ocean Sciences
Science is becoming data-intensive, requiring new software architectures that can exploit resources at all scales: local GPUs for interactive visualization, server-side multi-core ...
Keith Grochow, Bill Howe, Mark Stoermer, Roger S. ...
IWOMP
2007
Springer
15 years 10 months ago
Supporting OpenMP on Cell
The Cell processor is a heterogeneous multi-core processor with one Power Processing Engine (PPE) core and eight Synergistic Processing Engine (SPE) cores. Each SPE has a directly...
Kevin O'Brien, Kathryn M. O'Brien, Zehra Sura, Ton...