Sciweavers

315 search results - page 39 / 63
» Improving Performance by Branch Reordering
HPCA
2007
IEEE
14 years 9 months ago
Thermal Herding: Microarchitecture Techniques for Controlling Hotspots in High-Performance 3D-Integrated Processors
3D integration technology greatly increases transistor density while providing faster on-chip communication. 3D implementations of processors can simultaneously provide both laten...
Kiran Puttaswamy, Gabriel H. Loh
ICCAD
2005
IEEE
14 years 5 months ago
Hardware synthesis from guarded atomic actions with performance specifications
We present a new hardware synthesis methodology for guarded atomic actions (or rules), which satisfies performance-related scheduling specifications provided by the designer. The ...
Daniel L. Rosenband
ICCD
2000
IEEE
14 years 5 months ago
Cheap Out-of-Order Execution Using Delayed Issue
In superscalar architectures, out-of-order issue mechanisms increase performance by dynamically rescheduling instructions that cannot be statically reordered by the compiler. Whil...
J. P. Grossman
SC
2000
ACM
14 years 1 month ago
Automatically Tuned Collective Communications
The performance of the MPI’s collective communications is critical in most MPI-based applications. A general algorithm for a given collective communication operation may not giv...
Sathish S. Vadhiyar, Graham E. Fagg, Jack Dongarra
ICS
1999
Tsinghua U.
14 years 1 month ago
Nonlinear array layouts for hierarchical memory systems
Programming languages that provide multidimensional arrays and a flat linear model of memory must implement a mapping between these two domains to order array elements in memory....
Siddhartha Chatterjee, Vibhor V. Jain, Alvin R. Le...