The performance of the MPI’s collective communications is critical in most MPI-based applications. A general algorithm for a given collective communication operation may not giv...
Sathish S. Vadhiyar, Graham E. Fagg, Jack Dongarra
Abstract. Current multicore computers differ in many hardware aspects. Tuning parallel applications is indispensable to achieve best performance on a particular hardware platform....
Frank Otto, Christoph A. Schaefer, Matthias Dempe,...
Detecting the presence of musical sounds in broadcast audio is important for content-based indexing and retrieval of auditory and visual information in radio and TV programs. In t...
Cache misses form a major bottleneck for memory-intensive applications, due to the significant latency of main memory accesses. Loop tiling, in conjunction with other program tran...
In recent years, several approaches have been proposed to use profile information in compiler optimization. This profile information can be used at the source level to guide loo...
Masayo Haneda, Peter M. W. Knijnenburg, Harry A. G...