The trend towards integrating multiple cores on the same die has accentuated the need for larger on-chip caches. Such large caches are constructed as a multitude of smaller cache ...
Reetuparna Das, Asit K. Mishra, Chrysostomos Nicop...
This work presents the first extensive study of singlenode performance optimization, tuning, and analysis of the fast multipole method (FMM) on modern multicore systems. We consid...
Aparna Chandramowlishwaran, Samuel Williams, Leoni...
Many large scale applications, have significant I/O requirements as well as computational and memory requirements. Unfortunately, limited number of I/O nodes provided by the conte...
Meenakshi A. Kandaswamy, Mahmut T. Kandemir, Alok ...
SIMD extension is one of the most common and effective technique to exploit data-level parallelism in today’s processor designs. However, the performance of SIMD architectures i...
Performance evaluation studies are to be an integral part of the design and tuning of parallel applications. We propose a hierarchical approach to the systematic characterization o...
Maria Calzarossa, Alessandro P. Merlo, Daniele Tes...