Sciweavers

307 search results - page 55 / 62
» Automated Performance Measurement of Parallel Programs
ICS
2007
Tsinghua U.
14 years 2 months ago
Optimization of data prefetch helper threads with path-expression based statistical modeling
This paper investigates helper threads that improve performance by prefetching data on behalf of an application’s main thread. The focus is data prefetch helper threads that lac...
Tor M. Aamodt, Paul Chow
ISPASS
2010
IEEE
13 years 6 months ago
Weak execution ordering - exploiting iterative methods on many-core GPUs
On NVIDIA's many-core GPUs, there is no synchronization function among parallel thread blocks. When fine-granularity of data communication and synchronization is requ...
Jianmin Chen, Zhuo Huang, Feiqi Su, Jih-Kwon Peir,...
SIGOPS
2010
13 years 7 months ago
Visual and algorithmic tooling for system trace analysis: a case study
Despite advances in the application of automated statistical and machine learning techniques to system log and trace data there will always be a need for human analysis of machine...
Wim De Pauw, Steve Heisig
CODES
2005
IEEE
14 years 2 months ago
SOMA: a tool for synthesizing and optimizing memory accesses in ASICs
Arbitrary memory dependencies and variable latency memory systems are major obstacles to the synthesis of large-scale ASIC systems in high-level synthesis. This paper presents SOM...
Girish Venkataramani, Tiberiu Chelcea, Seth Copen ...
ICPP
2003
IEEE
14 years 1 months ago
Data Conversion for Process/Thread Migration and Checkpointing
Process/thread migration and checkpointing schemes support load balancing, load sharing and fault tolerance to improve application performance and system resource usage on worksta...
Hai Jiang, Vipin Chaudhary, John Paul Walters