Sciweavers

CODES
2009
IEEE
13 years 9 months ago
A scalable parallel H.264 decoder on the cell broadband engine architecture
The H.264 video codec provides exceptional video compression while imposing dramatic increases in computational complexity over previous standards. While exploiting parallelism in...
Michael A. Baker, Pravin Dalale, Karam S. Chatha, ...
OPODIS
2004
13 years 9 months ago
Lock-Free and Practical Doubly Linked List-Based Deques Using Single-Word Compare-and-Swap
We present an efficient and practical lock-free implementation of a concurrent deque that supports parallelism for disjoint accesses and uses atomic primitives which are ...
Håkan Sundell, Philippas Tsigas
DAC
1999
ACM
14 years 15 days ago
Automated Phase Assignment for the Synthesis of Low Power Domino Circuits
High performance circuit techniques such as domino logic have migrated from the microprocessor world into more mainstream ASIC designs. The problem is that domino logic comes at a...
Priyadarshan Patra, Unni Narayanan
SIGMOD
2008
ACM
14 years 8 months ago
Relational joins on graphics processors
We present a novel design and implementation of relational join algorithms for new-generation graphics processing units (GPUs). The most recent GPU features include support for wr...
Bingsheng He, Ke Yang, Rui Fang, Mian Lu, Naga K. ...
FCCM
2006
IEEE
14 years 2 months ago
Hardware/Software Integration for FPGA-based All-Pairs Shortest-Paths
Field-Programmable Gate Arrays (FPGAs) are being employed in high performance computing systems owing to their potential to accelerate a wide variety of long-running routines. Par...
Uday Bondhugula, Ananth Devulapalli, James Dinan, ...