Sciweavers

ISCA
2012
IEEE
208views Hardware» more  ISCA 2012»
12 years 3 months ago
Harmony: Collection and analysis of parallel block vectors
Efficient execution of well-parallelized applications is central to performance in the multicore era. Program analysis tools support the hardware and software sides of this effor...
Melanie Kambadur, Kui Tang, Martha A. Kim
ISCA
2012
IEEE
232views Hardware» more  ISCA 2012»
12 years 3 months ago
RADISH: Always-on sound and complete race detection in software and hardware
Data-race freedom is a valuable safety property for multithreaded programs that helps with catching bugs, simplifying memory consistency model semantics, and verifying and enforci...
Joseph Devietti, Benjamin P. Wood, Karin Strauss, ...
ISCA
2012
IEEE
243views Hardware» more  ISCA 2012»
12 years 3 months ago
BlockChop: Dynamic squash elimination for hybrid processor architecture
Hybrid processors are HW/SW co-designed processors that leverage blocked-execution, the execution of regions of instructions as atomic blocks, to facilitate aggressive speculative...
Jason Mars, Naveen Kumar
ISCA
2012
IEEE
281views Hardware» more  ISCA 2012»
12 years 3 months ago
LOT-ECC: Localized and tiered reliability mechanisms for commodity memory systems
Memory system reliability is a serious and growing concern in modern servers. Existing chipkill-level memory protection mechanisms suffer from several drawbacks. They activate a l...
Aniruddha N. Udipi, Naveen Muralimanohar, Rajeev B...
ISCA
2012
IEEE
262views Hardware» more  ISCA 2012»
12 years 3 months ago
Boosting mobile GPU performance with a decoupled access/execute fragment processor
Smartphones represent one of the fastest growing markets, providing significant hardware/software improvements every few months. However, supporting these capabilities reduces the...
Jose-Maria Arnau, Joan-Manuel Parcerisa, Polychron...
ISCA
2012
IEEE
333views Hardware» more  ISCA 2012»
12 years 3 months ago
Reducing memory reference energy with opportunistic virtual caching
Most modern cores perform a highly-associative translation look aside buffer (TLB) lookup on every memory access. These designs often hide the TLB lookup latency by overlapping it...
Arkaprava Basu, Mark D. Hill, Michael M. Swift
ISCA
2012
IEEE
279views Hardware» more  ISCA 2012»
12 years 3 months ago
Staged memory scheduling: Achieving high performance and scalability in heterogeneous systems
When multiple processor (CPU) cores and a GPU integrated together on the same chip share the off-chip main memory, requests from the GPU can heavily interfere with requests from t...
Rachata Ausavarungnirun, Kevin Kai-Wei Chang, Lava...
ISCA
2012
IEEE
260views Hardware» more  ISCA 2012»
12 years 3 months ago
A case for exploiting subarray-level parallelism (SALP) in DRAM
Modern DRAMs have multiple banks to serve multiple memory requests in parallel. However, when two requests go to the same bank, they have to be served serially, exacerbating the h...
Yoongu Kim, Vivek Seshadri, Donghyuk Lee, Jamie Li...
ISCA
2012
IEEE
224views Hardware» more  ISCA 2012»
12 years 3 months ago
A first-order mechanistic model for architectural vulnerability factor
Soft error reliability has become a first-order design criterion for modern microprocessors. Architectural Vulnerability Factor (AVF) modeling is often used to capture the probab...
Arun A. Nair, Stijn Eyerman, Lieven Eeckhout, Lizy...
ISCA
2012
IEEE
218views Hardware» more  ISCA 2012»
12 years 3 months ago
Towards energy-proportional datacenter memory with mobile DRAM
To increase datacenter energy efficiency, we need memory systems that keep pace with processor efficiency gains. Currently, servers use DDR3 memory, which is designed for high b...
Krishna T. Malladi, Frank A. Nothaft, Karthika Per...