Sciweavers

720 search results - page 79 / 144
» Uniform Memory Hierarchies
Sort
View
COMPGEOM
2005
ACM
15 years 6 months ago
Cache-oblivious r-trees
We develop a cache-oblivious data structure for storing a set S of N axis-aligned rectangles in the plane, such that all rectangles in S intersecting a query rectangle or point ca...
Lars Arge, Mark de Berg, Herman J. Haverkort
ICS
2007
Tsinghua U.
15 years 10 months ago
Adaptive Strassen's matrix multiplication
Strassen’s matrix multiplication (MM) has benefits with respect to any (highly tuned) implementations of MM because Strassen’s reduces the total number of operations. Strasse...
Paolo D'Alberto, Alexandru Nicolau
CF
2007
ACM
15 years 8 months ago
An analysis of the effects of miss clustering on the cost of a cache miss
In this paper we describe a new technique, called pipeline spectroscopy, and use it to measure the cost of each cache miss. The cost of a miss is displayed (graphed) as a histogra...
Thomas R. Puzak, Allan Hartstein, Philip G. Emma, ...
175
Voted
PLDI
2012
ACM
13 years 7 months ago
Adaptive input-aware compilation for graphics engines
While graphics processing units (GPUs) provide low-cost and efficient platforms for accelerating high performance computations, the tedious process of performance tuning required...
Mehrzad Samadi, Amir Hormati, Mojtaba Mehrara, Jan...
PVM
2010
Springer
15 years 3 months ago
Adaptive MPI Multirail Tuning for Non-uniform Input/Output Access
Multicore processors have not only reintroduced Non-Uniform Memory Access (NUMA) architectures in nowadays parallel computers, but they are also responsible for non-uniform access ...
Stephanie Moreaud, Brice Goglin, Raymond Namyst