Software implementations of modern block ciphers often require large lookup tables along with code-size-increasing optimizations such as loop unrolling to reach peak performance on g...
On a distributed memory machine, hand-coded message passing leads to the most efficient execution, but it is difficult to use. Parallelizing compilers can approach the performance...
Thermal management of DRAM has become a critical issue for server systems. We have done, to the best of our knowledge, the first study of software thermal management for memory su...
Multiprocessors are now commonplace, and cloud computing is swiftly following suit. While it is possible to write high-performance code for these systems, concurrency bugs are ext...
Multicore designs have emerged as the mainstream design paradigm for the microprocessor industry. Unfortunately, providing multiple cores does not directly translate into performa...
Mojtaba Mehrara, Jeff Hao, Po-Chun Hsu, Scott A. M...