Sciweavers

39 search results - page 3 / 8
» Compiler Generated Multithreading to Alleviate Memory Latenc...
MIDDLEWARE
2010
Springer
13 years 5 months ago
Automatically Generating Symbolic Prefetches for Distributed Transactional Memories
Developing efficient distributed applications while managing complexity can be challenging. Managing network latency is a key challenge for distributed applications. We ...
Alokika Dash, Brian Demsky
PLDI
1995
ACM
13 years 11 months ago
Improving Balanced Scheduling with Compiler Optimizations that Increase Instruction-Level Parallelism
Traditional list schedulers order instructions based on an optimistic estimate of the load latency imposed by the hardware and therefore cannot respond to variations in memory lat...
Jack L. Lo, Susan J. Eggers
ASPLOS
2008
ACM
13 years 9 months ago
Communication optimizations for global multi-threaded instruction scheduling
The recent shift in the industry towards chip multiprocessor (CMP) designs has brought the need for multi-threaded applications to mainstream computing. As observed in several lim...
Guilherme Ottoni, David I. August
ICS
2009
Tsinghua U.
14 years 2 months ago
Computer generation of fast Fourier transforms for the Cell Broadband Engine
The Cell BE is a multicore processor with eight vector accelerators (called SPEs) that implement explicit cache management through direct memory access engines. While the Cell has...
Srinivas Chellappa, Franz Franchetti, Markus Pü...
IPPS
2006
IEEE
14 years 1 month ago
Compiler assisted dynamic management of registers for network processors
Modern network processors support high levels of parallelism in packet processing by supporting multiple threads that execute on a micro-engine. Threads switch context upon encoun...
R. Collins, Fernando Alegre, Xiaotong Zhuang, Sant...