Sciweavers

27 search results - page 4 / 6
» Memory efficient scheduling of Strassen-Winograd's matrix mu...
ICC
2007
IEEE
14 years 4 months ago
An Asymptotically Sum-Rate Optimal Precoding Scheme for MIMO Gaussian Broadcast Channel
In this paper, we study the downlink precoding schemes for MIMO Gaussian broadcast channels (MIMO GBC). A novel low-complexity zero-forcing dirty-paper-coding (DPC) scheme, name...
Hao Li, Changqing Xu, Pingzhi Fan
SPAA
2006
ACM
14 years 3 months ago
The cache complexity of multithreaded cache oblivious algorithms
We present a technique for analyzing the number of cache misses incurred by multithreaded cache oblivious algorithms on an idealized parallel machine in which each processor has a...
Matteo Frigo, Volker Strumpen
PPL
2008
13 years 9 months ago
Experimental Evaluation of BSP Programming Libraries
The model of bulk-synchronous parallel computation (BSP) helps to implement portable general purpose algorithms while keeping predictable performance on different parallel compute...
Peter Krusche
ICCS
2009
Springer
13 years 7 months ago
Parallel MLEM on Multicore Architectures
The efficient use of multicore architectures for sparse matrix-vector multiplication (SpMV) is currently an open challenge. One algorithm which makes use of SpMV is the ma...
Tilman Küstner, Josef Weidendorfer, Jasmine S...
IPPS
1999
IEEE
14 years 2 months ago
Reducing I/O Complexity by Simulating Coarse Grained Parallel Algorithms
Block-wise access to data is a central theme in the design of efficient external memory (EM) algorithms. A second important issue, when more than one disk is present, is fully par...
Frank K. H. A. Dehne, David A. Hutchinson, Anil Ma...