Sciweavers

453 search results - page 14 / 91
» Execution and Cache Performance of the Scheduled Dataflow Ar...
Sort
View
IPPS
1999
IEEE
13 years 12 months ago
Dynamically Scheduling the Trace Produced During Program Execution into VLIW Instructions
VLIW machines possibly provide the most direct way to exploit instruction level parallelism; however, they cannot be used to emulate current general-purpose instruction set archit...
Alberto Ferreira de Souza, Peter Rounce
ICPADS
2002
IEEE
14 years 16 days ago
Evaluating and Improving Performance of Multimedia Applications on Simultaneous Multi-Threading
This paper presents the study and results of running several core multimedia applications on a simultaneous multithreading (SMT) architecture, including some detailed analysis ran...
Yen-Kuang Chen, Eric Debes, Rainer Lienhart, Matth...
MICRO
2008
IEEE
79views Hardware» more  MICRO 2008»
13 years 7 months ago
Strategies for mapping dataflow blocks to distributed hardware
Distributed processors must balance communication and concurrency. When dividing instructions among the processors, key factors are the available concurrency, criticality of depen...
Behnam Robatmili, Katherine E. Coons, Doug Burger,...
ISCA
2010
IEEE
232views Hardware» more  ISCA 2010»
14 years 21 days ago
Data marshaling for multi-core architectures
Previous research has shown that Staged Execution (SE), i.e., dividing a program into segments and executing each segment at the core that has the data and/or functionality to bes...
M. Aater Suleman, Onur Mutlu, José A. Joao,...
DATE
2008
IEEE
165views Hardware» more  DATE 2008»
14 years 2 months ago
Dynamic Round-Robin Task Scheduling to Reduce Cache Misses for Embedded Systems
Modern embedded CPU systems rely on a growing number of software features, but this growth increases the memory footprint and increases the need for efficient instruction and data...
Ken W. Batcher, Robert A. Walker