Data locality is critical to achievinghigh performance on large-scale parallel machines. Non-local data accesses result in communication that can greatly impact performance. Thus ...
Concurrent Collections (CnC)[8] is a declarative parallel language that allows the application developer to express their parallel application as a collection of high-level comput...
This paper presents a new compiler optimization algorithm that parallelizes applications for symmetric, sharedmemory multiprocessors. The algorithm considers data locality, parall...
In this paper, we deal with broadcasting on heterogeneous platforms. Typically, the message to be broadcast is split into several slices, which are sent by the source processor in...