—This paper reports a study of mapping the Finite Difference Time Domain (FDTD) application to the IBM Cyclops64 (C64) many-core chip architecture [1]. C64 is chosen for this stu...
—The increasing demand for resources of the high performance computing systems has led to new forms of collaboration of distributed systems such as interoperable grid systems tha...
Parallel simulation is a technique to accelerate microarchitecture simulation of CMPs by exploiting the inherent parallelism of CMPs. In this paper, we explore the simulation para...
Memory-intensive threads can hoard shared resources without making progress on a multithreading processor (SMT), thereby hindering the overall system performance. A recent promisi...
—Recent advances in DNA sequencing techniques have led to an unprecedented accumulation and availability of molecular sequence data that needs to be analyzed. This data explosion...
Abstract—This paper analyzes energy characteristics of parallel algorithms executed on scalable multicore processors. Specifically, we provide a methodology for evaluating energ...
Providing continuous on-demand streaming services with VCR functionality over ubiquitous environments is challenging due to the stringent QoS requirements of streaming service as ...
Qifeng Yu, Tianyin Xu, Baoliu Ye, Sanglu Lu, Daoxu...
Abstract—As Chip-Multiprocessor systems (CMP) have become the predominant topology for leading microprocessors, critical components of the system are now integrated on a single c...
Dimitris Kaseridis, Jeffrey Stuecheli, Lizy K. Joh...
— High-end computing (HEC) systems have passed the petaflop barrier and continue to move toward the next frontier of exascale computing. As companies and research institutes con...
Narayan Desai, Darius Buntinas, Daniel Buettner, P...
—The implementation and optimization of collective communication operations is an important field of active research. Such operations directly influence application performance...
Torsten Hoefler, Christian Siebert, Andrew Lumsdai...