Sciweavers

901 search results - page 124 / 181
» Hiding Communication Latency in Data Parallel Applications
Sort
View
MICRO
2005
IEEE
130views Hardware» more  MICRO 2005»
14 years 1 months ago
Exploiting Vector Parallelism in Software Pipelined Loops
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
CGO
2009
IEEE
14 years 2 months ago
Software Pipelined Execution of Stream Programs on GPUs
—The StreamIt programming model has been proposed to exploit parallelism in streaming applications on general purpose multicore architectures. This model allows programmers to sp...
Abhishek Udupa, R. Govindarajan, Matthew J. Thazhu...
IPPS
2007
IEEE
14 years 2 months ago
Programming Distributed Memory Sytems Using OpenMP
OpenMP has emerged as an important model and language extension for shared-memory parallel programming. On shared-memory platforms, OpenMP offers an intuitive, incremental approac...
Ayon Basumallik, Seung-Jai Min, Rudolf Eigenmann
PPOPP
2010
ACM
14 years 5 months ago
Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs?
Most modern Chip Multiprocessors (CMP) feature shared cache on chip. For multithreaded applications, the sharing reduces communication latency among co-running threads, but also r...
Eddy Z. Zhang, Xipeng Shen, Yunlian Jiang
ICDCS
2000
IEEE
14 years 8 days ago
Scheduling Heuristics for Data Requests in an Oversubscribed Network with Priorities and Deadlines
Providing up-to-date input to users’ applications is an important data management problem for a distributed computing environment, where each data storage location and intermedi...
Mitchell D. Theys, Noah Beck, Howard Jay Siegel, M...