Abstract. This paper presents a study of performance optimization of dense matrix multiplication on IBM Cyclops-64(C64) chip architecture. Although much has been published on how t...
Ziang Hu, Juan del Cuvillo, Weirong Zhu, Guang R. ...
—This paper reports a study of mapping the Finite Difference Time Domain (FDTD) application to the IBM Cyclops64 (C64) many-core chip architecture [1]. C64 is chosen for this stu...
We explore different prefetch distance-degree combinations and very simple, low-cost adaptive policies on a superscalar core with a high bandwidth, high capacity on-chip memory hie...
In this paper we study the impact of sharing memory resources on five Google datacenter applications: a web search engine, bigtable, content analyzer, image stitching, and protoc...
Lingjia Tang, Jason Mars, Neil Vachharajani, Rober...
Nested paging is a hardware solution for alleviating the software memory management overhead imposed by system virtualization. Nested paging complements existing page walk hardwar...
Ravi Bhargava, Ben Serebrin, Francesco Spadini, Sr...