We analyze circuits for a number of kernels from popular quantum computing applications, characterizing the hardware resources necessary to take ancilla preparation off the critic...
Nemanja Isailovic, Mark Whitney, Yatish Patel, Joh...
Three-dimensional integration enables stacking memory directly on top of a microprocessor, thereby significantly reducing wire delay between the two. Previous studies have examin...
Future chip multiprocessors (CMPs) may have hundreds to thousands of threads competing to access shared resources, and will require quality-of-service (QoS) support to improve sys...
A high-concurrency transactional memory (TM) implementation needs to track concurrent accesses, buffer speculative updates, and manage conflicts. We present a system, FlexTM (FLE...
Arrvindh Shriraman, Sandhya Dwarkadas, Michael L. ...
We demonstrate how fine-grained memory protection can be used in support of transactional memory systems: first showing how a software transactional memory system (STM) can be m...
Flash is a widely used storage device that provides high density and low power, appealing properties for general purpose computing. Today, its usual application is in portable spe...
Process variations are poised to significantly degrade performance benefits sought by moving to the next nanoscale technology node. Parameter fluctuations in devices can introd...
In a chip-multiprocessor (CMP) system, the DRAM system is shared among cores. In a shared DRAM system, requests from a thread can not only delay requests from other threads by cau...
Current state-of-the-art on-chip networks provide efficiency, high throughput, and low latency for one-to-one (unicast) traffic. The presence of one-to-many (multicast) or one-t...
Natalie D. Enright Jerger, Li-Shiuan Peh, Mikko H....