Sciweavers

94 search results - page 14 / 19
» Multiple-banked register file architectures
Sort
View
IPPS
2010
IEEE
13 years 5 months ago
A GPU-inspired soft processor for high-throughput acceleration
There is building interest in using FPGAs as accelerators for high-performance computing, but existing systems for programming them are so far inadequate. In this paper we propose...
Jeffrey Kingyens, J. Gregory Steffan
ISPD
2003
ACM
132views Hardware» more  ISPD 2003»
14 years 1 months ago
Architecture and synthesis for multi-cycle communication
For multi-gigahertz designs in nanometer technologies, data transfers on global interconnects take multiple clock cycles. In this paper, we propose a regular distributed register ...
Jason Cong, Yiping Fan, Xun Yang, Zhiru Zhang
EUROPAR
2010
Springer
13 years 8 months ago
Optimized Dense Matrix Multiplication on a Many-Core Architecture
Abstract. Traditional parallel programming methodologies for improving performance assume cache-based parallel systems. However, new architectures, like the IBM Cyclops-64 (C64), b...
Elkin Garcia, Ioannis E. Venetis, Rishi Khan, Guan...
VLSID
2004
IEEE
107views VLSI» more  VLSID 2004»
14 years 8 months ago
Performance Analysis of Inter Cluster Communication Methods in VLIW Architecture
With increasing demands for high performance by embedded systems, especially by digital signal processing applications, embedded processors must increase available instruction lev...
Sourabh Saluja, Anshul Kumar
ICCD
2003
IEEE
147views Hardware» more  ICCD 2003»
14 years 4 months ago
An Efficient VLIW DSP Architecture for Baseband Processing
The VLIW processors with static instruction scheduling and thus deterministic execution times are very suitable for highperformance real-time DSP applications. But the two major w...
Tay-Jyi Lin, Chin-Chi Chang, Chen-Chia Lee, Chein-...