hardware | Sciweavers


IEEEPACT
2009
IEEE


Soft-OLP: Improving Hardware Cache Performance through Software-Controlled Object-Level Partitioning


Performance degradation of memory-intensive programs caused by the LRU policy’s inability to handle weak-locality data accesses in the last-level cache is increasingly serious ...

Qingda Lu, Jiang Lin, Xiaoning Ding, Zhao Zhang, X...


IEEEPACT
2009
IEEE


Mapping Out a Path from Hardware Transactional Memory to Speculative Multithreading


This research demonstrates that coming support for hardware transactional memory can be leveraged to significantly reduce the cost of implementing true speculative multithread...

Leo Porter, Bumyong Choi, Dean M. Tullsen


IEEEPACT
2009
IEEE


Analytical Modeling of Pipeline Parallelism


Parallel programming is a requirement in the multi-core era. One of the most promising techniques to make parallel programming available for the general users is the use of parall...

Angeles G. Navarro, Rafael Asenjo, Siham Tabik, Ca...


IEEEPACT
2009
IEEE


Automatic Tuning of Discrete Fourier Transforms Driven by Analytical Modeling


Analytical models have been used to estimate optimal values for parameters such as tile sizes in the context of loop nests. However, important algorithms such as fast Fourier tr...

Basilio B. Fraguela, Yevgen Voronenko, Markus Pü...


IEEEPACT
2009
IEEE


CPROB: Checkpoint Processing with Opportunistic Minimal Recovery


CPR (Checkpoint Processing and Recovery) is a physical register management scheme that supports a larger instruction window and higher average IPC than conventional ROB-style re...

Andrew D. Hilton, Neeraj Eswaran, Amir Roth


IEEEPACT
2009
IEEE


StealthTest: Low Overhead Online Software Testing Using Transactional Memory


Software testing is hard. The emergence of multicore architectures and the proliferation of bug-prone multithreaded software make testing even harder. To this end, researchers h...

Jayaram Bobba, Weiwei Xiong, Luke Yen, Mark D. Hil...


IEEEPACT
2009
IEEE


Polyhedral-Model Guided Loop-Nest Auto-Vectorization


Optimizing compilers apply numerous interdependent optimizations, leading to the notoriously difficult phase-ordering problem: that of deciding which transformations...

Konrad Trifunovic, Dorit Nuzman, Albert Cohen, Aya...


IEEEPACT
2009
IEEE


Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors


With increasing numbers of cores, future CMPs (Chip Multi-Processors) are likely to have a tiled architecture with a portion of shared L2 cache on each tile and a bank-interleave...

Qingda Lu, Christophe Alias, Uday Bondhugula, Thom...


IEEEPACT
2009
IEEE


Architecture Support for Improving Bulk Memory Copying and Initialization Performance


Bulk memory copying and initialization is one of the most ubiquitous operations performed in current computer systems by both user applications and operating systems. While many...

Xiaowei Jiang, Yan Solihin, Li Zhao, Ravishankar I...


IEEEPACT
2009
IEEE


Anaphase: A Fine-Grain Thread Decomposition Scheme for Speculative Multithreading


Industry is moving towards multi-core designs as we have hit the memory and power walls. Multi-core designs are very effective to exploit thread-level parallelism (TLP) but do not...

Carlos Madriles, Pedro López, Josep M. Codi...
