Sciweavers

244 search results - page 10 / 49
» Optimizing Loop Performance for Clustered VLIW Architectures
Sort
View
ICPPW
2008
IEEE
14 years 2 months ago
Performance Analysis and Optimization of Parallel Scientific Applications on CMP Cluster Systems
Chip multiprocessors (CMP) are widely used for high performance computing. Further, these CMPs are being configured in a hierarchical manner to compose a node in a cluster system....
Xingfu Wu, Valerie E. Taylor, Charles W. Lively, S...
RTCSA
2006
IEEE
14 years 1 months ago
Integrating Compiler and System Toolkit Flow for Embedded VLIW DSP Processors
To support high-performance and low-power for multimedia applications and for hand-held devices, embedded VLIW DSP processors are of research focus. With the tight resource constr...
Chi Wu, Kun-Yuan Hsieh, Yung-Chia Lin, Chung-Ju Wu...
IPPS
1998
IEEE
13 years 11 months ago
Comparing the Optimal Performance of Different MIMD Multiprocessor Architectures
We compare the performance of systems consisting of one large cluster containing q processors with systems where processors are grouped into k clusters containing u processors eac...
Lars Lundberg, Håkan Lennerstad
IPPS
1996
IEEE
13 years 12 months ago
A Method for Register Allocation to Loops in Multiple Register File Architectures
Multiple instruction issue processors place high demands on register file bandwidth. One solution to reduce this bottleneck is the use of multiple register files. Register allocat...
David J. Kolson, Alexandru Nicolau, Nikil D. Dutt,...
CASES
2004
ACM
14 years 1 months ago
A low power architecture for embedded perception
Recognizing speech, gestures, and visual features are important interface capabilities for future embedded mobile systems. Unfortunately, the real-time performance requirements of...
Binu K. Mathew, Al Davis, Michael Parker