Sciweavers

74 search results - page 10 / 15
» Data Access Partitioning for Fine-grain Parallelism on Multi...
Sort
View
HPCA
2006
IEEE
14 years 8 months ago
Last level cache (LLC) performance of data mining workloads on a CMP - a case study of parallel bioinformatics workloads
With the continuing growth in the amount of genetic data, members of the bioinformatics community are developing a variety of data-mining applications to understand the data and d...
Aamer Jaleel, Matthew Mattina, Bruce L. Jacob
EUROPAR
2010
Springer
13 years 8 months ago
Thread Owned Block Cache: Managing Latency in Many-Core Architecture
Abstract. Shared last level cache is crucial to performance. However, multithread program model incurs serious contention in shared cache. In this paper, to reduce average cache ac...
Fenglong Song, Zhiyong Liu, Dongrui Fan, Hao Zhang...
HPCA
2009
IEEE
14 years 8 months ago
Dynamic hardware-assisted software-controlled page placement to manage capacity allocation and sharing within large caches
In future multi-cores, large amounts of delay and power will be spent accessing data in large L2/L3 caches. It has been recently shown that OS-based page coloring allows a non-uni...
Manu Awasthi, Kshitij Sudan, Rajeev Balasubramonia...
EUROPAR
2008
Springer
13 years 9 months ago
MPC: A Unified Parallel Runtime for Clusters of NUMA Machines
Over the last decade, Message Passing Interface (MPI) has become a very successful parallel programming environment for distributed memory architectures such as clusters. However, ...
Marc Pérache, Hervé Jourdren, Raymon...
ISCA
2010
IEEE
199views Hardware» more  ISCA 2010»
14 years 15 days ago
A case for FAME: FPGA architecture model execution
Given the multicore microprocessor revolution, we argue that the architecture research community needs a dramatic increase in simulation capacity. We believe FPGA Architecture Mod...
Zhangxi Tan, Andrew Waterman, Henry Cook, Sarah Bi...