Matrix multiplication is an important kernel in linear algebra algorithms, and the performance of both serial and parallel implementations is highly dependent on the memory system...
Siddhartha Chatterjee, Alvin R. Lebeck, Praveen K....
The new video coding standard, H.264 allows motion estimation performing on tree-structured block partitioning and multiple reference frames. This feature improves the prediction ...
To truly exploit FPGAs for rapid turn-around development and prototyping, placement times must be reduced to seconds; latebound, reconfigurable computing applications may demand p...
Large-scale logistic regression arises in many applications such as document classification and natural language processing. In this paper, we apply a trust region Newton method t...
The problem of choosing fast implementations for a class of recursive algorithms such as the fast Fourier transforms can be formulated as an optimization problem over the language...