Automated code generation and performance tuning techniques for concurrent architectures such as GPUs, Cell and FPGAs can provide integer factor speedups over multi-core processor...
The Merrimac supercomputer uses stream processors and a highradix network to achieve high performance at low cost and low power. The stream architecture matches the capabilities o...
Mattan Erez, Jung Ho Ahn, Ankit Garg, William J. D...
During the last half-decade, a number of research efforts have centered around developing software for generating automatically tuned matrix multiplication kernels. These include ...
John A. Gunnels, Fred G. Gustavson, Greg Henry, Ro...
Performance tuning is an important and time consuming task which may have to be repeated for each new application and platform. Although iterative optimisation can automate this p...
This paper describes the optimal tuning for the output limits of the power system stabilizer (PSS), which can improve the system damping performance immediately following a large d...
Seung-Mook Baek, Jung-Wook Park, Ganesh K. Venayag...