Sciweavers

527 search results - page 35 / 106
» Characteristics of workloads used in high performance and te...
JSSPP
2004
Springer
14 years 2 months ago
Reconfigurable Gang Scheduling Algorithm
Using a single traditional gang scheduling algorithm cannot provide the best performance for all workloads and parallel architectures. A solution for this problem is the use of...
Luís Fabrício Wanderley Góes,...
HIPC
2009
Springer
13 years 6 months ago
Continuous performance monitoring for large-scale parallel applications
Traditional performance analysis techniques are performed after a parallel program has completed. In this paper, we describe an online method for continuously monitoring the perfor...
Isaac Dooley, Chee Wai Lee, Laxmikant V. Kalé...
ICCS
2005
Springer
14 years 2 months ago
Performance and Scalability Analysis of Cray X1 Vectorization and Multistreaming Optimization
Cray X1 Fortran and C/C++ compilers provide a number of loop transformations, notably vectorization and multistreaming, in order to exploit the multistreaming processor (MSP) hard...
Sadaf R. Alam, Jeffrey S. Vetter
CONEXT
2008
ACM
13 years 10 months ago
Distributed content delivery using load-aware network coordinates
To scale to millions of Internet users with good performance, content delivery networks (CDNs) must balance requests between content servers while assigning clients to nearby serv...
Nicholas Ball, Peter R. Pietzuch
DAC
2012
ACM
11 years 11 months ago
A QoS-aware memory controller for dynamically balancing GPU and CPU bandwidth use in an MPSoC
Diverse IP cores are integrated on a modern system-on-chip and share resources. Off-chip memory bandwidth is often the scarcest resource and requires careful allocation. Two of t...
Min Kyu Jeong, Mattan Erez, Chander Sudanthi, Nige...