Sciweavers

» Using Hardware Counters to Automatically Improve Memory Perf...
ICPP
2000
IEEE
14 years 1 month ago
Match Virtual Machine: An Adaptive Runtime System to Execute MATLAB in Parallel
MATLAB is one of the most popular languages for desktop numerical computations as well as for signal and image processing applications. Applying parallel processing techniques to...
Malay Haldar, Anshuman Nayak, Abhay Kanhere, Pramo...
MICRO
2010
IEEE
13 years 6 months ago
Efficient Selection of Vector Instructions Using Dynamic Programming
Accelerating program performance via SIMD vector units is very common in modern processors, as evidenced by the use of SSE, MMX, VSE, and VSX SIMD instructions in multimedia, scien...
Rajkishore Barik, Jisheng Zhao, Vivek Sarkar
IWMM
2009
Springer
14 years 3 months ago
A new approach to parallelising tracing algorithms
Tracing algorithms visit reachable nodes in a graph and are central to activities such as garbage collection, marshalling etc. Traditional sequential algorithms use a worklist, re...
Cosmin E. Oancea, Alan Mycroft, Stephen M. Watt
ISCAS
2006
IEEE
14 years 2 months ago
An efficient texture cache for programmable vertex shaders
Vertex texturing is state-of-the-art functionality of the 3D geometry processor. However, it aggravates the ... of vertex. Thus, traditional texture caches used in RE are not always appli...
Seunghyun Cho, Chang-Hyo Yu, Lee-Sup Kim
ASPLOS
2011
ACM
13 years 13 days ago
Inter-core prefetching for multicore processors using migrating helper threads
Multicore processors have become ubiquitous in today’s systems, but exploiting the parallelism they offer remains difficult, especially for legacy application and applications ...
Md Kamruzzaman, Steven Swanson, Dean M. Tullsen