A blossoming paradigm for block-recursive matrix algorithms is presented that, at once, attains excellent performance measured by • time, • TLB misses, • L1 misses, • L2 m...
The problem of computing periods in words, or finite sequences of symbols from a finite alphabet, has important applications in several areas including data compression, string se...
This paper presents a system for rapid editing of highly dynamic motion capture data. At the heart of this system is an optimization algorithm that can transform the captured moti...
In the framework of perfect loop nests with uniform dependences, tiling has been extensively studied as a source-to-source program transformation. Little work has been devoted to ...
Data caches are essential in modern processors, bridging the widening gap between main memory and processor speeds. However, they yield very complex performance models, which make...