Abstract. We describe how the ASSIST parallel programming environment can be used to run parallel programs on collections of heterogeneous workstations and evaluate the scalability...
Marco Aldinucci, Sonia Campa, Massimo Coppola, Sil...
Memory latency tolerant architectures support thousands of in-flight instructions without scaling cyclecritical processor resources, and thousands of useful instructions can compl...
Amit Gandhi, Haitham Akkary, Ravi Rajwar, Srikanth...
As the number of cores and threads in manycore compute accelerators such as Graphics Processing Units (GPU) increases, so does the importance of on-chip interconnection network des...
In this paper, we explore the use of hierarchically structured multiprocessor tasks (M-tasks) for programming multi-core cluster systems. These systems often have hierarchically s...
In a hardware transactional memory system with lazy versioning and lazy conflict detection, the process of transaction commit can emerge as a bottleneck. This is especially true ...
Seth H. Pugsley, Manu Awasthi, Niti Madan, Naveen ...