Sciweavers

685 search results - page 99 / 137
» Performance of Runtime Optimization on BLAST
CGO
2010
IEEE
14 years 1 month ago
Efficient compilation of fine-grained SPMD-threaded programs for multicore CPUs
In this paper we describe techniques for compiling fine-grained SPMD-threaded programs, expressed in programming models such as OpenCL or CUDA, to multicore execution platforms. Pr...
John A. Stratton, Vinod Grover, Jaydeep Marathe, B...
ICCAD
1993
IEEE
14 years 1 month ago
Parallel timing simulation on a distributed memory multiprocessor
Circuit simulation is one of the most computationally expensive tasks in circuit design and optimization. Detailed simulation at the level of precision of SPICE is usually perform...
Chih-Po Wen, Katherine A. Yelick
CASES
2007
ACM
14 years 29 days ago
Fragment cache management for dynamic binary translators in embedded systems with scratchpad
Dynamic binary translation (DBT) has been used to achieve numerous goals (e.g., better performance) for general-purpose computers. Recently, DBT has also attracted attention for e...
José Baiocchi, Bruce R. Childers, Jack W. D...
CC
2008
Springer
13 years 11 months ago
Efficient Context-Sensitive Shape Analysis with Graph Based Heap Models
The performance of heap analysis techniques has a significant impact on their utility in an optimizing compiler. Most shape analysis techniques perform interprocedural dataflow ana...
Mark Marron, Manuel V. Hermenegildo, Deepak Kapur,...
CF
2008
ACM
13 years 11 months ago
DMA-based prefetching for I/O-intensive workloads on the Cell architecture
The recent advent of asymmetric multi-core processors such as the Cell Broadband Engine (Cell/BE) has popularized the use of heterogeneous architectures. A growing body of research is...
M. Mustafa Rafique, Ali Raza Butt, Dimitrios S. Ni...