Sciweavers

1022 search results - page 185 / 205
» Automatic data and computation decomposition on distributed ...
Sort
View
IEEEPACT
2008
IEEE
14 years 2 months ago
The PARSEC benchmark suite: characterization and architectural implications
This paper presents and characterizes the Princeton Application Repository for Shared-Memory Computers (PARSEC), a benchmark suite for studies of Chip-Multiprocessors (CMPs). Prev...
Christian Bienia, Sanjeev Kumar, Jaswinder Pal Sin...
HPCA
2009
IEEE
14 years 8 months ago
PageNUCA: Selected policies for page-grain locality management in large shared chip-multiprocessor caches
As the last-level on-chip caches in chip-multiprocessors increase in size, the physical locality of on-chip data becomes important for delivering high performance. The non-uniform...
Mainak Chaudhuri
ICS
1998
Tsinghua U.
13 years 11 months ago
Load Execution Latency Reduction
In order to achieve high performance, contemporary microprocessors must effectively process the four major instruction types: ALU, branch, load, and store instructions. This paper...
Bryan Black, Brian Mueller, Stephanie Postal, Ryan...
ICS
2009
Tsinghua U.
14 years 2 months ago
Zero-content augmented caches
It has been observed that some applications manipulate large amounts of null data. Moreover these zero data often exhibit high spatial locality. On some applications more than 20%...
Julien Dusser, Thomas Piquet, André Seznec
CLUSTER
2008
IEEE
14 years 2 months ago
Live and incremental whole-system migration of virtual machines using block-bitmap
—In this paper, we describe a whole-system live migration scheme, which transfers the whole system run-time state, including CPU state, memory data, and local disk storage, of th...
Yingwei Luo, Binbin Zhang, Xiaolin Wang, Zhenlin W...