Sciweavers

IWMM
2009
Springer
14 years 3 months ago
A component model of spatial locality
Good spatial locality alleviates both the latency and bandwidth problem of memory by boosting the effect of prefetching and improving the utilization of cache. However, convention...
Xiaoming Gu, Ian Christopher, Tongxin Bai, Chengli...
TASLP
2008
13 years 8 months ago
Rapid Speaker Adaptation Using Clustered Maximum-Likelihood Linear Basis With Sparse Training Data
Speaker space based adaptation methods for automatic speech recognition have been shown to provide significant performance improvements for tasks where only a few second...
Yun Tang, Richard Rose
PLDI
2004
ACM
14 years 2 months ago
Vectorization for SIMD architectures with alignment constraints
When vectorizing for SIMD architectures that are commonly employed by today’s multimedia extensions, one of the new challenges that arise is the handling of memory alignment. Pr...
Alexandre E. Eichenberger, Peng Wu, Kevin O'Brien
ISPASS
2009
IEEE
14 years 3 months ago
Machine learning based online performance prediction for runtime parallelization and task scheduling
With the emerging many-core paradigm, parallel programming must extend beyond its traditional realm of scientific applications. Converting existing sequential applications as w...
Jiangtian Li, Xiaosong Ma, Karan Singh, Martin Sch...
IWMM
2010
Springer
14 years 1 months ago
Speculative parallelization using state separation and multiple value prediction
With the availability of chip multiprocessor (CMP) and simultaneous multithreading (SMT) machines, extracting thread level parallelism from a sequential program has become crucial...
Chen Tian, Min Feng, Rajiv Gupta