Sciweavers

244 search results - page 22 / 49
» Optimizing Loop Performance for Clustered VLIW Architectures
DATE
2010
IEEE
14 years 27 days ago
Exploiting local logic structures to optimize multi-core SoC floorplanning
We present a throughput-driven partitioning algorithm and a throughput-preserving merging algorithm for the high-level physical synthesis of latency-insensitive (LI) sys...
Cheng-Hong Li, Sampada Sonalkar, Luca P. Carloni
LCPC
2000
Springer
13 years 11 months ago
SmartApps: An Application Centric Approach to High Performance Computing
State-of-the-art run-time systems are a poor match to diverse, dynamic distributed applications because they are designed to provide support to a wide variety of applications, with...
Lawrence Rauchwerger, Nancy M. Amato, Josep Torrel...
CLUSTER
2007
IEEE
14 years 2 months ago
Balancing productivity and performance on the cell broadband engine
The Cell Broadband Engine (BE) is a heterogeneous multicore processor, combining a general-purpose POWER architecture core with eight independent single-instruction-multiple-dat...
Sadaf R. Alam, Jeremy S. Meredith, Jeffrey S. Vett...
ISLPED
2004
ACM
14 years 1 months ago
Delay optimal low-power circuit clustering for FPGAs with dual supply voltages
This paper presents a delay optimal FPGA clustering algorithm targeting low power. We assume that the configurable logic blocks of the FPGA can be programmed using either a high s...
Deming Chen, Jason Cong
CASES
2007
ACM
13 years 11 months ago
INTACTE: an interconnect area, delay, and energy estimation tool for microarchitectural explorations
Prior work on modeling interconnects has focused on optimizing the wire and repeater design for trading off energy and delay, and is largely based on low level circuit parameters....
Rahul Nagpal, Arvind Madan, Bharadwaj Amrutur, Y. ...