Sciweavers

901 search results - page 73 / 181
» Hiding Communication Latency in Data Parallel Applications
Sort
View
IPPS
2006
IEEE
14 years 2 months ago
Coterminous locality and coterminous group data prefetching on chip-multiprocessors
Due to shared cache contentions and interconnect delays, data prefetching is more critical in alleviating penalties from increasing memory latencies and demands on Chip-Multiproce...
Xudong Shi, Zhen Yang, Jih-Kwon Peir, Lu Peng, Yen...
VLSID
2004
IEEE
107views VLSI» more  VLSID 2004»
14 years 8 months ago
Performance Analysis of Inter Cluster Communication Methods in VLIW Architecture
With increasing demands for high performance by embedded systems, especially by digital signal processing applications, embedded processors must increase available instruction lev...
Sourabh Saluja, Anshul Kumar
CLUSTER
2008
IEEE
14 years 2 months ago
In search of sweet-spots in parallel performance monitoring
—Parallel performance monitoring extends parallel measurement systems with infrastructure and interfaces for online performance data access, communication, and analysis. At the s...
Aroon Nataraj, Allen D. Malony, Allen Morris, Dori...
CASES
2007
ACM
14 years 14 hour ago
Light-weight synchronization for inter-processor communication acceleration on embedded MPSoCs
Advances in semiconductor technologies have placed MPSoCs center stage as a standard architecture for embedded applications of ever increasing complexity. Efficient utilization of...
Chengmo Yang, Alex Orailoglu
FPL
2004
Springer
164views Hardware» more  FPL 2004»
13 years 11 months ago
Dynamic Prefetching in the Virtual Memory Window of Portable Reconfigurable Coprocessors
Abstract. In Reconfigurable Systems-On-Chip (RSoCs), operating systems can primarily (1) manage the sharing of limited reconfigurable resources, and (2) support communication betwe...
Miljan Vuletic, Laura Pozzi, Paolo Ienne