Sciweavers

315 search results - page 39 / 63
» Improving Performance by Branch Reordering
Sort
View
HPCA
2007
IEEE
16 years 4 months ago
Thermal Herding: Microarchitecture Techniques for Controlling Hotspots in High-Performance 3D-Integrated Processors
3D integration technology greatly increases transistor density while providing faster on-chip communication. 3D implementations of processors can simultaneously provide both laten...
Kiran Puttaswamy, Gabriel H. Loh
ICCAD
2005
IEEE
127views Hardware» more  ICCAD 2005»
16 years 16 days ago
Hardware synthesis from guarded atomic actions with performance specifications
We present a new hardware synthesis methodology for guarded atomic actions (or rules), which satisfies performance-related scheduling specifications provided by the designer. The ...
Daniel L. Rosenband
ICCD
2000
IEEE
93views Hardware» more  ICCD 2000»
16 years 16 days ago
Cheap Out-of-Order Execution Using Delayed Issue
In superscalar architectures, out-of-order issue mechanisms increase performance by dynamically rescheduling instructions that cannot be statically reordered by the compiler. Whil...
J. P. Grossman
109
Voted
SC
2000
ACM
15 years 8 months ago
Automatically Tuned Collective Communications
The performance of the MPI’s collective communications is critical in most MPI-based applications. A general algorithm for a given collective communication operation may not giv...
Sathish S. Vadhiyar, Graham E. Fagg, Jack Dongarra
128
Voted
ICS
1999
Tsinghua U.
15 years 8 months ago
Nonlinear array layouts for hierarchical memory systems
Programming languages that provide multidimensional arrays and a flat linear model of memory must implement a mapping between these two domains to order array elements in memory....
Siddhartha Chatterjee, Vibhor V. Jain, Alvin R. Le...