Sciweavers

1510 search results - page 188 / 302
» The distributed Karhunen-Loeve transform
Sort
View
EUROPAR
2009
Springer
14 years 4 months ago
Process Mapping for MPI Collective Communications
It is an important problem to map virtual parallel processes to physical processors (or cores) in an optimized way to get scalable performance due to non-uniform communication cost...
Jin Zhang, Jidong Zhai, Wenguang Chen, Weimin Zhen...
PDP
2008
IEEE
14 years 4 months ago
Scheduling of QR Factorization Algorithms on SMP and Multi-Core Architectures
This paper examines the scalable parallel implementation of QR factorization of a general matrix, targeting SMP and multi-core architectures. Two implementations of algorithms-by-...
Gregorio Quintana-Ortí, Enrique S. Quintana...
SIPS
2008
IEEE
14 years 4 months ago
Efficient mapping of advanced signal processing algorithms on multi-processor architectures
Modern microprocessor technology is migrating from simply increasing clock speeds on a single processor to placing multiple processors on a die to increase throughput and power pe...
Bhavana B. Manjunath, Aaron S. Williams, Chaitali ...
ICPP
2007
IEEE
14 years 4 months ago
Real-Time Divisible Load Scheduling with Different Processor Available Times
Providing QoS and performance guarantees to arbitrarily divisible loads has become a significant problem for many cluster-based research computing facilities. While progress is b...
Xuan Lin, Ying Lu, Jitender S. Deogun, Steve Godda...
IPPS
2007
IEEE
14 years 4 months ago
Model-Guided Empirical Optimization for Multimedia Extension Architectures: A Case Study
Compiler technology for multimedia extensions must effectively utilize not only the SIMD compute engines but also the various levels of the memory hierarchy: superword registers,...
Chun Chen, Jaewook Shin, Shiva Kintali, Jacqueline...