Sciweavers

197 search results - page 26 / 40
» Detecting phases in parallel applications on shared memory a...
PPOPP
2009
ACM
14 years 8 months ago
A compiler-directed data prefetching scheme for chip multiprocessors
Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
HPCA
2009
IEEE
14 years 8 months ago
Design and implementation of software-managed caches for multicores with local memory
Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...
Sangmin Seo, Jaejin Lee, Zehra Sura
ISPAN
2009
IEEE
14 years 2 months ago
Vector Bank Based Multimedia Codec System-on-a-Chip (SoC) Design
In this paper, we present a design architecture of implementing a “Vector Bank” into video encoder system, namely, an H.264 encoder, in order to detect and analyze the movin...
Ruei-Xi Chen, Wei Zhao, Jeffrey Fan, Asad Davari
IPPS
1995
IEEE
13 years 11 months ago
Operating system support for concurrent remote task creation
This paper describes improvements to the Mach microkernel’s support for efficient application startup across multiple nodes in a cluster or massively parallel processor. Signifi...
Dejan S. Milojicic, David L. Black, Steven J. Sear...
ICPP
2000
IEEE
13 years 12 months ago
Match Virtual Machine: An Adaptive Runtime System to Execute MATLAB in Parallel
MATLAB is one of the most popular languages for desktop numerical computations as well as for signal and image processing applications. Applying parallel processing techniques to...
Malay Haldar, Anshuman Nayak, Abhay Kanhere, Pramo...