Interprocessor synchronization, while extremely important for ensuring execution correctness, can be very costly in terms of both power and performance overheads. Unfortunately, m...
Transactional memory is being advanced as an alternative to traditional lock-based synchronization for concurrent programming. Transactional memory simplifies the programming mode...
Takayuki Usui, Reimer Behrends, Jacob Evans, Yanni...
Systolic implementations of dynamic programming solutions that utilize a similarity matrix can achieve appreciable performance with both course- and fine-grain parallelization. A ...
We present a novel design and implementation of relational join algorithms for new-generation graphics processing units (GPUs). The most recent GPU features include support for wr...
Bingsheng He, Ke Yang, Rui Fang, Mian Lu, Naga K. ...
The Virtual Interface (VI) architecture has become the industry standard for user-level network interfaces. This paper presents the implementation and evaluation of Javia, a Java ...