In this paper, we investigate the data access patterns and file I/O behaviors of a production cosmology application that uses the adaptive mesh refinement (AMR) technique for it...
Jianwei Li, Wei-keng Liao, Alok N. Choudhary, Vale...
This work tries to derive ideas for thread allocation in Chip Multiprocessor (CMP)-based network processors performing general applications by Continuous-Time Markov Chain model...
This paper presents an algorithm to automatically map code on a generic intelligent memory system that consists of a host processor and a simpler memory processor. To achieve high...
Scientific, symbolic, and multimedia applications present diverse computing workloads with different types of inherent parallelism. Tomorrow’s processors will employ varying com...
Linda M. Wills, Tarek M. Taha, Lewis B. Baumstark ...
This paper describes an implementation of parallel LU factorization. The focus is to achieve high performance on non-dedicated clusters, where the number of available computing re...