Sciweavers

TPDS
1998
13 years 7 months ago
A Basic-Cycle Calculation Technique for Efficient Dynamic Data Redistribution
Array redistribution is usually required to enhance algorithm performance in many parallel programs on distributed memory multicomputers. Since it is performed at run-time, ther...
Yeh-Ching Chung, Ching-Hsien Hsu, Sheng-Wen Bai
MICRO
2005
IEEE
14 years 1 months ago
Exploiting Vector Parallelism in Software Pipelined Loops
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
GCC
2005
Springer
14 years 1 months ago
Incorporating Data Movement into Grid Task Scheduling
Task Scheduling is a critical design issue of distributed computing. The emerging Grid computing infrastructure consists of heterogeneous resources in widely distributed autonomous...
Xiaoshan He, Xian-He Sun
CGO
2006
IEEE
14 years 1 months ago
Compiler-directed Data Partitioning for Multicluster Processors
Multicluster architectures overcome the scaling problem of centralized resources by distributing the datapath, register file, and memory subsystem across multiple clusters connec...
Michael L. Chu, Scott A. Mahlke
CCGRID
2004
IEEE
13 years 11 months ago
High performance LU factorization for non-dedicated clusters
This paper describes an implementation of parallel LU factorization. The focus is to achieve high performance on non-dedicated clusters, where the number of available computing re...
Toshio Endo, Kenji Kaneda, Kenjiro Taura, Akinori ...