This paper explores the relation between the structured parallelism exposed by the Decomposable BSP (DBSP) model through submachine locality and locality of reference in multi-lev...
Andrea Pietracaprina, Geppino Pucci, Francesco Sil...
Cache misses form a major bottleneck for memory-intensive applications, due to the significant latency of main memory accesses. Loop tiling, in conjunction with other program tran...
Application-specific system-on-chip platforms create the opportunity to customize the cache configuration for optimal performance with minimal chip estate. Simulation, in partic...
On-chip memory organization is one of the most important aspects that can influence the overall system behavior in multiprocessor systems. Following the trend set by high-perform...
Mohamed M. Sabry, Martino Ruggiero, Pablo Garcia D...
The emerging trend of larger number of cores or processors on a single chip in the server, desktop, and mobile notebook platforms necessarily demands larger amount of on-chip last...