Sciweavers

HPCA
2006
IEEE
14 years 12 months ago
Last level cache (LLC) performance of data mining workloads on a CMP - a case study of parallel bioinformatics workloads
With the continuing growth in the amount of genetic data, members of the bioinformatics community are developing a variety of data-mining applications to understand the data and d...
Aamer Jaleel, Matthew Mattina, Bruce L. Jacob
HPCA
2006
IEEE
14 years 12 months ago
Retention-aware placement in DRAM (RAPID): software methods for quasi-non-volatile DRAM
Measurements of an off-the-shelf DRAM chip confirm that different cells retain information for different amounts of time. This result extends to DRAM rows, or pages (retention tim...
Ravi K. Venkatesan, Stephen Herr, Eric Rotenberg
HPCA
2006
IEEE
14 years 12 months ago
Exploiting parallelism and structure to accelerate the simulation of chip multi-processors
Simulation is an important means of evaluating new microarchitectures. Current trends toward chip multiprocessors (CMPs) try the ability of designers to develop efficient simulato...
David A. Penry, Daniel Fay, David Hodgdon, Ryan We...
HPCA
2006
IEEE
14 years 12 months ago
Software-hardware cooperative memory disambiguation
In high-end processors, increasing the number of in-flight instructions can improve performance by overlapping useful processing with long-latency accesses to the main memory. Buf...
Ruke Huang, Alok Garg, Michael C. Huang
HPCA
2006
IEEE
14 years 12 months ago
The common case transactional behavior of multithreaded programs
Transactional memory (TM) provides an easy-to-use and high-performance parallel programming model for the upcoming chip-multiprocessor systems. Several researchers have proposed a...
JaeWoong Chung, Hassan Chafi, Chi Cao Minh, Austen...
HPCA
2006
IEEE
14 years 12 months ago
Efficient instruction schedulers for SMT processors
We propose dynamic scheduler designs to improve the scheduler scalability and reduce its complexity in the SMT processors. Our first design is an adaptation of the recently propos...
Joseph J. Sharkey, Dmitry V. Ponomarev
HPCA
2006
IEEE
14 years 12 months ago
CORD: cost-effective (and nearly overhead-free) order-recording and data race detection
Chip-multiprocessors are becoming the dominant vehicle for general-purpose processing, and parallel software will be needed to effectively utilize them. This parallel software is ...
Milos Prvulovic
HPCA
2006
IEEE
14 years 12 months ago
LogTM: log-based transactional memory
Transactional memory (TM) simplifies parallel programming by guaranteeing that transactions appear to execute atomically and in isolation. Implementing these properties includes p...
Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan...
HPCA
2006
IEEE
14 years 12 months ago
An approach for implementing efficient superscalar CISC processors
An integrated, hardware / software co-designed CISC processor is proposed and analyzed. The objectives are high performance and reduced complexity. Although the x86 ISA is targete...
Shiliang Hu, Ilhyun Kim, Mikko H. Lipasti, James E...
HPCA
2006
IEEE
14 years 12 months ago
Completely verifying memory consistency of test program executions
An important means of validating the design of commercial-grade shared memory multiprocessors is to run a large number of pseudo-random test programs on them. However, when intent...
Chaiyasit Manovit, Sudheendra Hangal