Sciweavers

122 search results - page 22 / 25
» Fast matrix multiplication is stable
FCCM
2007
IEEE
14 years 1 month ago
Optimizing Logarithmic Arithmetic on FPGAs
This paper proposes optimizations of the methods and parameters used in both mathematical approximation and hardware design for logarithmic number system (LNS) arithmetic. First, ...
Haohuan Fu, Oskar Mencer, Wayne Luk
ICMCS
2005
IEEE
14 years 1 month ago
Architecture for area-efficient 2-D transform in H.264/AVC
As VLSI technology advances continuously, ASICs can easily achieve the required performance, and most of them are actually over-designed. Thus, architecture shrinking is inevita...
Yu-Ting Kuo, Tay-Jyi Lin, Chih-Wei Liu, Chein-Wei ...
EUROPAR
2009
Springer
14 years 1 day ago
Distributed Data Partitioning for Heterogeneous Processors Based on Partial Estimation of Their Functional Performance Models
The paper presents a new data partitioning algorithm for parallel computing on heterogeneous processors. Like traditional functional partitioning algorithms, the algorithm assumes ...
Alexey L. Lastovetsky, Ravi Reddy
SIAMCOMP
2010
13 years 2 months ago
More Algorithms for All-Pairs Shortest Paths in Weighted Graphs
In the first part of the paper, we reexamine the all-pairs shortest paths (APSP) problem and present a new algorithm with running time O(n^3 log^3 log n / log^2 n), which improves all...
Timothy M. Chan
IPPS
2006
IEEE
14 years 1 month ago
A study of the on-chip interconnection network for the IBM Cyclops64 multi-core architecture
The design of high-performance processor architectures is moving toward the integration of a large number of processing cores on a single chip. The IBM Cyclops-64 (C64)...
Yingping Zhang, Taikyeong Jeong, Fei Chen, Haiping...