Parallel processing networks, even full crossbars, that only implement point-to-point and multicast message passing are inefficient for collective communications because multiple ...
High energy consumption has become a critical bottleneck to the further use of large-scale cluster systems in building new data centers. Among various components of a data center, sto...
Ziliang Zong, Matt Briggs, Nick O'Connor, Xiao Qin
This paper presents a hierarchical parallel MPEG-2 decoder for playing ultra-high-resolution videos on PC-cluster-based tiled display systems. To maximize parallelism while minimi...
Parallel genetic algorithms (PGAs) have been developed to reduce the large execution times that are associated with serial genetic algorithms (SGAs). They have also been used to s...
Lee Wang, Anthony A. Maciejewski, Howard Jay Siege...
Loop fusion improves data locality and reduces synchronization in data-parallel applications. However, loop fusion is not always legal. Even when legal, fusion may introduce loop-...