Abstract. We introduce a collection of high-performance kernels for basic linear algebra. The kernels encapsulate small fixed-size computations in order to provide building blocks for numerical libraries in C++. The sizes are template parameters of the kernels, so they can be easily configured to a specific architecture for portability. In this way the BLAIS delivers the power of such code generation systems as PHiPAC [1] and ATLAS [8]. BLAIS has a simple and elegant interface, so that one can write flexible-sized block algorithms without the complications of a code generation system. The BLAIS are implemented on the Fixed Algorithm Size Template (FAST) Library, which we also introduce in this paper. The FAST routines provide equivalent functionality to the algorithms in the Standard Template Library [7], but are tailored specifically for high-performance kernels.
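To make the idea of sizes as template parameters concrete, the following is a minimal sketch of how a fixed-size kernel can be written with compile-time recursion in C++. The names `sketch`, `fixed_dot`, and `fixed_matvec` are hypothetical and chosen only for illustration; they are not the BLAIS or FAST interface, and the row-major layout is an assumption.

```cpp
#include <cstddef>

// Hypothetical names for illustration only; not the BLAIS or FAST interface.
namespace sketch {

// Dot product of two length-N vectors. N is a template parameter, so the
// recursion below is resolved at compile time and leaves no runtime loop.
template <std::size_t N>
struct fixed_dot {
    template <class T>
    static T apply(const T* x, const T* y) {
        return x[0] * y[0] + fixed_dot<N - 1>::apply(x + 1, y + 1);
    }
};

// Base case: an empty dot product is the value-initialized T (zero).
template <>
struct fixed_dot<0> {
    template <class T>
    static T apply(const T*, const T*) { return T(); }
};

// y += A * x for an M-by-N block A stored in row-major order (an assumed
// layout). Each row is handled by one fixed_dot<N> call.
template <std::size_t M, std::size_t N>
struct fixed_matvec {
    template <class T>
    static void apply(const T* A, const T* x, T* y) {
        y[0] += fixed_dot<N>::apply(A, x);
        fixed_matvec<M - 1, N>::apply(A + N, x, y + 1);
    }
};

// Base case: no rows left.
template <std::size_t N>
struct fixed_matvec<0, N> {
    template <class T>
    static void apply(const T*, const T*, T*) {}
};

}  // namespace sketch
```

A caller would select the block size per architecture by changing only the template arguments, for example `sketch::fixed_matvec<4, 4>::apply(A, x, y)` on one machine and `sketch::fixed_matvec<2, 8>::apply(A, x, y)` on another, while the surrounding block algorithm stays the same.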
Jeremy G. Siek, Andrew Lumsdaine