Sciweavers

» Scalable Parallel Matrix Multiplication on Distributed Memor...
IPPS
1996
IEEE
14 years 25 days ago
A New Approach to Pipeline FFT Processor
A new VLSI architecture for a real-time pipeline FFT processor is proposed. A hardware-oriented radix-2² algorithm is derived by integrating a twiddle factor decomposition technique ...
Shousheng He, Mats Torkelson
IPPS
2007
IEEE
14 years 3 months ago
A Flexible Resource Management Architecture for the Blue Gene/P Supercomputer
Blue Gene®/P is a massively parallel supercomputer intended as the successor to Blue Gene/L. It leverages much of the existing architecture of its predecessor to provide scalabi...
Sam Miller, Mark Megerian, Paul Allen, Tom Budnik
IWCC
1999
IEEE
14 years 29 days ago
Optimizing User-Level Communication Patterns on the Fujitsu AP3000
In this paper, we present techniques and algorithms to improve the performance of various communication patterns on message-passing platforms where, for reasons of safety, user-le...
Jeremy E. Dawson, Peter E. Strazdins
CGO
2006
IEEE
14 years 2 months ago
Compiler-directed Data Partitioning for Multicluster Processors
Multicluster architectures overcome the scaling problem of centralized resources by distributing the datapath, register file, and memory subsystem across multiple clusters connec...
Michael L. Chu, Scott A. Mahlke
HIPC
2000
Springer
14 years 8 days ago
Meta-data Management System for High-Performance Large-Scale Scientific Data Access
Many scientific applications manipulate large amounts of data and, therefore, are parallelized on high-performance computing systems to take advantage of their computational power a...
Wei-keng Liao, Xiaohui Shen, Alok N. Choudhary