Many task-based programming models have been developed and refined in recent years to support application development for shared memory platforms. Asynchronous tasks are a powerfu...
James LaGrone, Ayodunni Aribuki, Cody Addison, Bar...
This paper presents a technique for memory optimization for a class of computations that arises in the field of correlated electronic structure methods such as coupled cluster and...
A. Allam, J. Ramanujam, Gerald Baumgartner, P. Sad...
Estimation of static and dynamic energy of caches is critical for high-performance low-power designs. Commercial CAD tools performing energy estimation statically are not aware of...
Shrikanth Ganapathy, Ramon Canal, Antonio Gonz&aac...
Sparse matrix operations achieve only small fractions of peak CPU speeds because of the use of specialized, indexbased matrix representations, which degrade cache utilization by i...
This paper describes the precise speci cation, design, analysis, implementation, and measurements of an e cient algorithm for solving regular tree grammar based constraints. The p...