Sciweavers

» Load Execution Latency Reduction
CGO
2009
IEEE
14 years 3 months ago
Reducing Memory Ordering Overheads in Software Transactional Memory
Most research into high-performance software transactional memory (STM) assumes that transactions will run on a processor with a relatively strict memory model, such as Total St...
Michael F. Spear, Maged M. Michael, Michael L. Sco...
GRID
2005
Springer
14 years 2 months ago
Collective operations for wide-area message passing systems using adaptive spanning trees
We propose a method for wide-area message passing systems to perform collective operations using dynamically created spanning trees. In our proposal, broadcasts and red...
Hideo Saito, Kenjiro Taura, Takashi Chikayama
ISCA
2011
IEEE
13 years 9 days ago
TLSync: support for multiple fast barriers using on-chip transmission lines
As the number of cores on a single-chip grows, scalable barrier synchronization becomes increasingly difficult to implement. In software implementations, such as the tournament ba...
Jungju Oh, Milos Prvulovic, Alenka G. Zajic
ASAP
2002
IEEE
14 years 1 months ago
Implications of Programmable General Purpose Processors for Compression/Encryption Applications
With the growth of the Internet and mobile communication industry, multimedia applications form a dominant computer workload. Media workloads are typically executed on Application...
Byeong Kil Lee, Lizy Kurian John
ICDE
2003
IEEE
14 years 10 months ago
Flux: An Adaptive Partitioning Operator for Continuous Query Systems
The long-running nature of continuous queries poses new scalability challenges for dataflow processing. CQ systems execute pipelined dataflows that may be shared across multiple q...
Mehul A. Shah, Joseph M. Hellerstein, Sirish Chand...