The deluge of available data for analysis demands the need to scale the performance of data mining implementations. With the current architectural trends, one of the major challen...
Hybrid architectures combining the strengths of generalpurpose processors with application-specific hardware accelerators can lead to a significant performance improvement. Our ...
—Communication traces are integral to performance modeling and analysis of parallel programs. However, execution on a large number of nodes results in a large trace volume that i...
This work presents the first extensive study of singlenode performance optimization, tuning, and analysis of the fast multipole method (FMM) on modern multicore systems. We consid...
Aparna Chandramowlishwaran, Samuel Williams, Leoni...
As symmetric multiprocessors become commonplace, the interconnection networks and the communication system software in clusters of multiprocessors become critical to achieving high...
Ying Qian, Ahmad Afsahi, Nathan R. Fredrickson, Re...