Sciweavers

363 search results - page 68 / 73
» Optimizing Memory Accesses For Spatial Computation
Sort
View
HPCA
2008
IEEE
14 years 8 months ago
Supporting highly-decoupled thread-level redundancy for parallel programs
The continued scaling of device dimensions and the operating voltage reduces the critical charge and thus natural noise tolerance level of transistors. As a result, circuits can p...
M. Wasiur Rashid, Michael C. Huang
PPOPP
2010
ACM
14 years 2 months ago
An adaptive performance modeling tool for GPU architectures
This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information ...
Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. P...
ANCS
2007
ACM
13 years 11 months ago
Towards high-performance flow-level packet processing on multi-core network processors
There is a growing interest in designing high-performance network devices to perform packet processing at flow level. Applications such as stateful access control, deep inspection...
Yaxuan Qi, Bo Xu, Fei He, Baohua Yang, Jianming Yu...
CCGRID
2011
IEEE
12 years 11 months ago
Small Discrete Fourier Transforms on GPUs
– Efficient implementations of the Discrete Fourier Transform (DFT) for GPUs provide good performance with large data sizes, but are not competitive with CPU code for small data ...
S. Mitra, A. Srinivasan
TVLSI
2010
13 years 2 months ago
A Low-Power DSP for Wireless Communications
This paper proposes a low-power high-throughput digital signal processor (DSP) for baseband processing in wireless terminals. It builds on our earlier architecture--Signal processi...
Hyunseok Lee, Chaitali Chakrabarti, Trevor N. Mudg...