This paper presents a new superscalar architecture for fast discrete cosine transform (DCT). Comparing with the general SIMD architecture, it speeds up the DCT computation by a fac...
Abstract. Designing and tuning parallel applications with MPI, particularly at large scale, requires understanding the performance implications of different choices of algorithms ...
Torsten Hoefler, William Gropp, Rajeev Thakur, Jes...
Abstract--This paper examines the initial parallel implementation of SCATTER, a computationally intensive inelastic neutron scattering routine with polycrystalline averaging capabi...
The availability of high speed networks and improved microprocessor performance have made it possible to build inexpensive cluster of workstations as an appealing platform for par...
Azzedine Boukerche, Sajal K. Das, Ajoy Kumar Datta...