Sciweavers

MICRO
2010
IEEE
Minimal Multi-threading: Finding and Removing Redundant Instructions in Multi-threaded Processors
Guoping Long, Diana Franklin, Susmit Biswas, Pablo...
MICRO
2010
IEEE
A Predictive Model for Dynamic Microarchitectural Adaptivity Control
Adaptive microarchitectures are a promising solution for designing high-performance, power-efficient microprocessors. They offer the ability to tailor computational resou...
Christophe Dubach, Timothy M. Jones, Edwin V. Boni...
MICRO
2010
IEEE
Improving SIMT Efficiency of Global Rendering Algorithms with Architectural Support for Dynamic Micro-Kernels
Wide Single Instruction, Multiple Thread (SIMT) architectures often require a static allocation of thread groups that are executed in lockstep throughout the entire application ker...
Michael Steffen, Joseph Zambreno
MICRO
2010
IEEE
ASF: AMD64 Extension for Lock-Free Data Structures and Transactional Memory
Advanced Synchronization Facility (ASF) is an AMD64 hardware extension for lock-free data structures and transactional memory. It provides a speculative region that atomically exec...
Jae-Woong Chung, Luke Yen, Stephan Diestelhorst, M...
MICRO
2010
IEEE
ScalableBulk: Scalable Cache Coherence for Atomic Blocks in a Lazy Environment
Recently proposed architectures that continuously operate on atomic blocks of instructions (also called chunks) can boost the programmability and performance of shared-memory mult...
Xuehai Qian, Wonsun Ahn, Josep Torrellas
MICRO
2010
IEEE
Erasing Core Boundaries for Robust and Configurable Performance
Single-thread performance, reliability and power efficiency are critical design challenges of future multicore systems. Although point solutions have been proposed to address thes...
Shantanu Gupta, Shuguang Feng, Amin Ansari, Scott ...
MICRO
2010
IEEE
Many-Thread Aware Prefetching Mechanisms for GPGPU Applications
We consider the problem of how to improve memory latency tolerance in massively multithreaded GPGPUs when the thread-level parallelism of an application is not sufficien...
Jaekyu Lee, Nagesh B. Lakshminarayana, Hyesoon Kim...
MICRO
2010
IEEE
Efficient Selection of Vector Instructions Using Dynamic Programming
Accelerating program performance via SIMD vector units is very common in modern processors, as evidenced by the use of SSE, MMX, VSE, and VSX SIMD instructions in multimedia, scien...
Rajkishore Barik, Jisheng Zhao, Vivek Sarkar
MICRO
2010
IEEE
Pseudo-Circuit: Accelerating Communication for On-Chip Interconnection Networks
As the number of cores on a single chip increases with more recent technologies, a packet-switched on-chip interconnection network has become a de facto communication paradigm for ...
Minseon Ahn, Eun Jung Kim
MICRO
2010
IEEE
Scalable Speculative Parallelization on Commodity Clusters
While clusters of commodity servers and switches are the most popular form of large-scale parallel computers, many programs are not easily parallelized for execution upon them. In...
Hanjun Kim, Arun Raman, Feng Liu, Jae W. Lee, Davi...