Even though graphics processors (GPUs) are becoming increasingly popular for general purpose computing, current (and likely near future) generations of GPUs do not provide hardwar...
Abstract: Modeling the interactions of biological molecules, or docking, is critical to both understanding basic life processes and to designing new drugs. Here we describe the GPU...
While deterministic replay of parallel programs is a powerful technique, current proposals have shortcomings. Specifically, software-based replay systems have high overheads on mu...
Pablo Montesinos, Matthew Hicks, Samuel T. King, J...
Parallel bit stream algorithms exploit the SWAR (SIMD within a register) capabilities of commodity processors in high-performance text processing applications such as UTF8 to UTF-...
We report on our experience with the hardware transactional memory (HTM) feature of two pre-production revisions of a new commercial multicore processor. Our experience includes a...
Multicore hardware is making concurrent programs pervasive. Unfortunately, concurrent programs are prone to bugs. Among different types of concurrency bugs, atomicity violation bu...
The wide availability of commodity graphics processors has made real-time graphics an intrinsic component of the human/computer interface. These graphics cores accelerate the z-bu...
Software failures in server applications are a significant problem for preserving system availability. We present ASSURE, a system that introduces rescue points that recover softw...
Stelios Sidiroglou, Oren Laadan, Carlos Perez, Nic...
Extracting performance from many-core architectures requires software engineers to create multi-threaded applications, which significantly complicates the already daunting task of...
Modern processors use branch target buffers (BTBs) to predict the target address of branches such that they can fetch ahead in the instruction stream increasing concurrency and pe...