We investigate factors that impact the effectiveness of caching to speed up discrete event simulation. Walsh and Sirer have shown that a variant of function caching (staged simula...
Packet delay greatly influences the overall performance of network applications. It is therefore important to identify causes and location of delay performance degradation within ...
Francesco Lo Presti, Nick G. Duffield, Joseph Horo...
Most modern Chip Multiprocessors (CMP) feature shared cache on chip. For multithreaded applications, the sharing reduces communication latency among co-running threads, but also r...
Existing supercomputers have hundreds of thousands of processor cores, and future systems may have hundreds of millions. Developers need detailed performance measurements to tune ...
Todd Gamblin, Bronis R. de Supinski, Martin Schulz...
Automatic tuning has emerged as a solution to provide high-performance libraries for fast changing, increasingly complex computer architectures. We distinguish offline adaptation (...