Sciweavers

» Architectural Exploration and Optimization of Local Memory i...
ICCS
2009
Springer
14 years 2 months ago
Generating Empirically Optimized Composed Matrix Kernels from MATLAB Prototypes
The development of optimized codes is time-consuming and requires extensive architecture, compiler, and language expertise; therefore, computational scientists are often forced to ...
Boyana Norris, Albert Hartono, Elizabeth R. Jessup...
CODES
2008
IEEE
14 years 2 months ago
Static analysis of processor stall cycle aggregation
Processor Idle Cycle Aggregation (PICA) is a promising approach for low power execution of processors, in which small memory stalls are aggregated to create a large one, and the p...
Jongeun Lee, Aviral Shrivastava
IEEEPACT
2006
IEEE
14 years 2 months ago
Whole-program optimization of global variable layout
On machines with high-performance processors, the memory system continues to be a performance bottleneck. Compilers insert prefetch operations and reorder data accesses to improve...
Nathaniel McIntosh, Sandya Mannarswamy, Robert Hun...
SSDBM
2010
IEEE
14 years 1 months ago
Client + Cloud: Evaluating Seamless Architectures for Visual Data Analytics in the Ocean Sciences
Science is becoming data-intensive, requiring new software architectures that can exploit resources at all scales: local GPUs for interactive visualization, server-side multi-core ...
Keith Grochow, Bill Howe, Mark Stoermer, Roger S. ...
IWOMP
2007
Springer
14 years 2 months ago
Supporting OpenMP on Cell
The Cell processor is a heterogeneous multi-core processor with one Power Processing Engine (PPE) core and eight Synergistic Processing Engine (SPE) cores. Each SPE has a directly...
Kevin O'Brien, Kathryn M. O'Brien, Zehra Sura, Ton...