Sciweavers

280 search results - page 7 / 56
» Challenges in exploitation of loop parallelism in embedded a...
Sort
View
CASES
2009
ACM
14 years 13 days ago
Exploiting residue number system for power-efficient digital signal processing in embedded processors
2's complement number system imposes a fundamental limitation on the power and performance of arithmetic circuits, due to the fundamental need of cross-datapath carry propaga...
Rooju Chokshi, Krzysztof S. Berezowski, Aviral Shr...
SAMOS
2004
Springer
14 years 1 months ago
Modeling Loop Unrolling: Approaches and Open Issues
Abstract. Loop unrolling plays an important role in compilation for Reconfigurable Processing Units (RPUs) as it exposes operator parallelism and enables other transformations (e.g...
João M. P. Cardoso, Pedro C. Diniz
JISE
2002
165views more  JISE 2002»
13 years 8 months ago
Locality-Preserving Dynamic Load Balancing for Data-Parallel Applications on Distributed-Memory Multiprocessors
Load balancing and data locality are the two most important factors in the performance of parallel programs on distributed-memory multiprocessors. A good balancing scheme should e...
Pangfeng Liu, Jan-Jan Wu, Chih-Hsuae Yang
PPOPP
2010
ACM
14 years 5 months ago
Data transformations enabling loop vectorization on multithreaded data parallel architectures
Loop vectorization, a key feature exploited to obtain high performance on Single Instruction Multiple Data (SIMD) vector architectures, is significantly hindered by irregular memo...
Byunghyun Jang, Perhaad Mistry, Dana Schaa, Rodrig...
ICPP
2003
IEEE
14 years 1 months ago
Procedural Level Address Offset Assignment of DSP Applications with Loops
Automatic optimization of address offset assignment for DSP applications, which reduces the number of address arithmetic instructions to meet the tight memory size restrictions an...
Youtao Zhang, Jun Yang 0002