Sciweavers

230 search results - page 16 / 46
» Performance Analysis of Parallel N-Body Codes
PLDI
2010
ACM
14 years 22 days ago
A GPGPU compiler for memory optimization and parallelism management
This paper presents a novel optimizing compiler for general purpose computation on graphics processing units (GPGPU). It addresses two major challenges of developing high performa...
Yi Yang, Ping Xiang, Jingfei Kong, Huiyang Zhou
FGCS
2006
13 years 7 months ago
Performance feature identification by comparative trace analysis
This work introduces a method for instrumenting applications, producing execution traces, and visualizing multiple trace instances to identify performance features. The approach p...
Daniel P. Spooner, Darren J. Kerbyson
IPPS
2003
IEEE
14 years 29 days ago
A Compilation Framework for Distributed Memory Parallelization of Data Mining Algorithms
With the availability of large datasets in a variety of scientific and commercial domains, data mining has emerged as an important area within the last decade. Data mining techni...
Xiaogang Li, Ruoming Jin, Gagan Agrawal
FCCM
2011
IEEE
12 years 11 months ago
Reducing the Energy Cost of Irregular Code Bases in Soft Processor Systems
This paper describes an architecture and FPGA synthesis toolchain for building specialized, energy-saving coprocessors called Irregular Code Energy Reducers (ICERs) for a wide ...
Manish Arora, Jack Sampson, Nathan Goulding-Hotta,...
CC
2012
Springer
12 years 3 months ago
Improving Performance of OpenCL on CPUs
Abstract. Data-parallel languages like OpenCL and CUDA are an important means to exploit the computational power of today’s computing devices. In this paper, we deal with two asp...
Ralf Karrenberg, Sebastian Hack