Sciweavers

272 search results - page 37 / 55
» Code Transformations to Improve Memory Parallelism
IPPS
2009
IEEE
14 years 3 months ago
High-order stencil computations on multicore clusters
Stencil computation (SC) is of critical importance for broad scientific and engineering applications. However, it is a challenge to optimize complex, high-order SC on emerging clus...
Liu Peng, Richard Seymour, Ken-ichi Nomura, Rajiv ...
CASCON
2008
13 years 10 months ago
High performance XML parsing using parallel bit stream technology
Parabix (parallel bit streams for XML) is an open-source XML parser that employs the SIMD (single-instruction multiple-data) capabilities of modern-day commodity processors to del...
Robert D. Cameron, Kenneth S. Herdy, Dan Lin
ISHPC
2003
Springer
14 years 1 month ago
Performance Study of a Whole Genome Comparison Tool on a Hyper-Threading Multiprocessor
We developed a multithreaded parallel implementation of a sequence alignment algorithm that is able to align whole genomes with reliable output and reasonable cost. This paper pres...
Juan del Cuvillo, Xinmin Tian, Guang R. Gao, Milin...
IPPS
2006
IEEE
14 years 2 months ago
A performance model for fine-grain accesses in UPC
UPC’s implicit communication and fine-grain programming style make application performance modeling a challenging task. The correspondence between remote references and communi...
Zhang Zhang, S. R. Seidel
EUROPAR
2005
Springer
14 years 2 months ago
DCT Block Conversion for H.264/AVC Video Transcoding
In H.264/AVC [1], integer transforms are applied instead of the 8×8 discrete cosine transform (DCT) of previous standards to avoid inverse transform mismatch problems. However, th...
Joo-Kyong Lee, Ki-Dong Chung