Multiple clock cycles are needed to cross the global interconnects for multi-gigahertz designs in nanometer technologies. For synchronous designs, this requires retiming and pipel...
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A into the product of a unitary matrix Q and upper triangular matrix R. Adaptive sys...
IP forwarding is one of the main bottlenecks in Internet backbone routers, as it requires performing the longest-prefix match at 10Gbps speed or higher. IPv6 forwarding further ex...
We describe the design, analysis, and performance of an on–line algorithm to dynamically control the frequency/voltage of a Multiple Clock Domain (MCD) microarchitecture. The MC...
Greg Semeraro, David H. Albonesi, Steve Dropsho, G...
Abstract-- The performance of collective communication operations is known to have a significant impact on the scalability of some applications. Indeed, the global, synchronous nat...
Ron Brightwell, Sue Goudy, Arun Rodrigues, Keith D...