Sciweavers

ISCA
2008
IEEE
14 years 3 months ago
Fetch-Criticality Reduction through Control Independence
Architectures that exploit control independence (CI) promise to remove in-order fetch bottlenecks, like branch mispredicts, instruction-cache misses and fetch unit stalls, from th...
Mayank Agarwal, Nitin Navale, Kshitiz Malik, Matth...
ISCA
2008
IEEE
14 years 3 months ago
Flexible Hardware Acceleration for Instruction-Grain Program Monitoring
Instruction-grain program monitoring tools, which check and analyze executing programs at the granularity of individual instructions, are invaluable for quickly detecting bugs and...
Shimin Chen, Michael Kozuch, Theodoros Strigkos, B...
ISCA
2008
IEEE
14 years 3 months ago
Variation-Aware Application Scheduling and Power Management for Chip Multiprocessors
Within-die process variation causes individual cores in a Chip Multiprocessor (CMP) to differ substantially in both static power consumed and maximum frequency supported. In this ...
Radu Teodorescu, Josep Torrellas
ISCA
2008
IEEE
14 years 3 months ago
Corona: System Implications of Emerging Nanophotonic Technology
We expect that many-core microprocessors will push performance per chip from the 10 gigaflop to the 10 teraflop range in the coming decade. To support this increased performance...
Dana Vantrease, Robert Schreiber, Matteo Monchiero...
ISCA
2008
IEEE
14 years 3 months ago
Understanding and Designing New Server Architectures for Emerging Warehouse-Computing Environments
This paper seeks to understand and design next-generation servers for emerging “warehouse-computing” environments. We make two key contributions. First, we put together a detail...
Kevin T. Lim, Parthasarathy Ranganathan, Jichuan C...
ISCA
2008
IEEE
14 years 3 months ago
VEAL: Virtualized Execution Accelerator for Loops
Performance improvement solely through transistor scaling is becoming more and more difficult, thus it is increasingly common to see domain specific accelerators used in conjunc...
Nathan Clark, Amir Hormati, Scott A. Mahlke
IEEEPACT
2008
IEEE
14 years 3 months ago
COMIC: a coherent shared memory interface for Cell BE
Jaejin Lee, Sangmin Seo, Chihun Kim, Junghyun Kim,...
IEEEPACT
2008
IEEE
14 years 3 months ago
Feature selection and policy optimization for distributed instruction placement using reinforcement learning
Communication overheads are one of the fundamental challenges in a multiprocessor system. As the number of processors on a chip increases, communication overheads and the distribu...
Katherine E. Coons, Behnam Robatmili, Matthew E. T...
IEEEPACT
2008
IEEE
14 years 3 months ago
Exploiting loop-dependent stream reuse for stream processors
The memory access limits the performance of stream processors. By exploiting the reuse of data held in the Stream Register File (SRF), an on-chip storage, the number of memory acc...
Xuejun Yang, Ying Zhang, Jingling Xue, Ian Rogers,...
IEEEPACT
2008
IEEE
14 years 3 months ago
Analysis and approximation of optimal co-scheduling on chip multiprocessors
Yunlian Jiang, Xipeng Shen, Jie Chen, Rahul Tripat...