This paper proposes and evaluates hardware mechanisms for supporting prescient instruction prefetch--an approach to improving single-threaded application performance by using help...
Tor M. Aamodt, Paul Chow, Per Hammarlund, Hong Wan...
Conventional instruction fetch mechanisms fetch contiguous blocks of instructions in each cycle. They are difficult to scale since taken branches make it hard to increase the siz...
Parallel bit stream algorithms exploit the SWAR (SIMD within a register) capabilities of commodity processors in high-performance text processing applications such as UTF8 to UTF-...
Binary finite fields GF(2n ) are very commonly used in cryptography, particularly in publickey algorithms such as Elliptic Curve Cryptography (ECC). On word-oriented programmable ...
The throughput of a multiple-pipelined processor suffers due to lack of sufficient instructions to make multiple pipelines busy and due to delays associated with pipeline depende...