Sciweavers

955 search results - page 5 / 191
CASES
2003
ACM
14 years 29 days ago
Vectorizing for a SIMdD DSP architecture
The Single Instruction Multiple Data (SIMD) model for fine-grained parallelism was recently extended to support SIMD operations on disjoint vector elements. In this paper we demon...
Dorit Naishlos, Marina Biberstein, Shay Ben-David,...
DAC
1995
ACM
13 years 11 months ago
A Transformation-Based Approach for Storage Optimization
High-level synthesis (HLS) has been successfully targeted towards the digital signal processing (DSP) domain. Both application-specific integrated circuits (ASICs) and application-...
Wei-Kai Cheng, Youn-Long Lin
EUROPAR
2010
Springer
13 years 8 months ago
Optimized Dense Matrix Multiplication on a Many-Core Architecture
Traditional parallel programming methodologies for improving performance assume cache-based parallel systems. However, new architectures, like the IBM Cyclops-64 (C64), b...
Elkin Garcia, Ioannis E. Venetis, Rishi Khan, Guan...
VLSISP
2008
13 years 7 months ago
Memory-constrained Block Processing for DSP Software Optimization
Digital signal processing (DSP) applications involve processing long streams of input data. It is important to take into account this form of processing when implementing embedded ...
Ming-Yung Ko, Chung-Ching Shen, Shuvra S. Bhattach...
DAC
1996
ACM
13 years 12 months ago
Using Register-Transfer Paths in Code Generation for Heterogeneous Memory-Register Architectures
In this paper we address the problem of code generation for basic blocks in heterogeneous memory-register DSP processors. We propose a new technique, based on register-transfer ...
Guido Araujo, Sharad Malik, Mike Tien-Chien Lee