Cache hierarchies have been traditionally designed for usage by a single application, thread or core. As multi-threaded (MT) and multi-core (CMP) platform architectures emerge and...
We propose a Parallel Banding Algorithm (PBA) on the GPU to compute the exact Euclidean Distance Transform (EDT) for a binary image in 2D and higher dimensions. Partitioning the i...
Thanh-Tung Cao, Ke Tang, Anis Mohamed, Tiow Seng T...
Abstract--The increasing availability of multi-core and multiprocessor architectures provides new opportunities for improving the performance of many computer simulations. Markov C...
Jonathan M. R. Byrd, Stephen A. Jarvis, Abhir H. B...
Suffix trees are indexing structures that enhance the performance of numerous string processing algorithms. In this paper, we propose cache-conscious suffix tree construction algo...
Linked data structure (LDS) accesses are critical to the performance of many large scale applications. Techniques have been proposed to prefetch such accesses. Unfortunately, many...