Sciweavers

» Measuring the Performance of Parallel Message-Based Process ...
ICASSP
2011
IEEE
12 years 11 months ago
A high throughput parallel AVC/H.264 context-based adaptive binary arithmetic decoder
In this paper, based on the proposed parallelization scheme of binary arithmetic decoding, a parallel AVC/H.264 context-based adaptive binary arithmetic coding (CABAC) decoder wit...
Jia-Wei Liang, He-Yuan Lin, Gwo Giun Lee
EUROPAR
2011
Springer
12 years 7 months ago
A Fully Empirical Autotuned Dense QR Factorization for Multicore Architectures
Tuning numerical libraries has become more difficult over time, as systems get more sophisticated. In particular, modern multicore machines make the behaviour of algorithms hard ...
Emmanuel Agullo, Jack Dongarra, Rajib Nath, Stanim...
HOTI
2005
IEEE
14 years 1 month ago
Hybrid Cache Architecture for High Speed Packet Processing
The exposed memory hierarchies employed in many network processors (NPs) are expensive in terms of meeting the worst-case processing requirement. Moreover, it is difficult to ef...
Zhen Liu, Kai Zheng, Bin Liu
PPOPP
2010
ACM
14 years 5 months ago
Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs?
Most modern Chip Multiprocessors (CMP) feature shared cache on chip. For multithreaded applications, the sharing reduces communication latency among co-running threads, but also r...
Eddy Z. Zhang, Xipeng Shen, Yunlian Jiang
FPL
2008
Springer
13 years 9 months ago
Direct sigma-delta modulated signal processing in FPGA
The effectiveness of implementing bit-stream signal processing (BSSP) multiplier circuits in FPGAs, in terms of hardware resources and clock frequency, is presented. In particular...
Chiu-Wah Ng, Ngai Wong, Hayden Kwok-Hay So, Tung-S...