Sciweavers

244 search results - page 22 / 49
» Optimizing Loop Performance for Clustered VLIW Architectures
Sort
View
113
Voted
DATE
2010
IEEE
119views Hardware» more  DATE 2010»
15 years 8 months ago
Exploiting local logic structures to optimize multi-core SoC floorplanning
Abstract—We present a throughput-driven partitioning algorithm and a throughput-preserving merging algorithm for the high-level physical synthesis of latency-insensitive (LI) sys...
Cheng-Hong Li, Sampada Sonalkar, Luca P. Carloni
136
Voted
LCPC
2000
Springer
15 years 7 months ago
SmartApps: An Application Centric Approach to High Performance Computing
State-of-the-art run-time systems are a poor match to diverse, dynamic distributed applications because they are designed to provide support to a wide variety of applications, with...
Lawrence Rauchwerger, Nancy M. Amato, Josep Torrel...
144
Voted
CLUSTER
2007
IEEE
15 years 10 months ago
Balancing productivity and performance on the cell broadband engine
— The Cell Broadband Engine (BE) is a heterogeneous multicore processor, combining a general-purpose POWER architecture core with eight independent single-instructionmultiple-dat...
Sadaf R. Alam, Jeremy S. Meredith, Jeffrey S. Vett...
149
Voted
ISLPED
2004
ACM
169views Hardware» more  ISLPED 2004»
15 years 9 months ago
Delay optimal low-power circuit clustering for FPGAs with dual supply voltages
This paper presents a delay optimal FPGA clustering algorithm targeting low power. We assume that the configurable logic blocks of the FPGA can be programmed using either a high s...
Deming Chen, Jason Cong
163
Voted
CASES
2007
ACM
15 years 7 months ago
INTACTE: an interconnect area, delay, and energy estimation tool for microarchitectural explorations
Prior work on modeling interconnects has focused on optimizing the wire and repeater design for trading off energy and delay, and is largely based on low level circuit parameters....
Rahul Nagpal, Arvind Madan, Bharadwaj Amrutur, Y. ...