In this paper we show the power of sampling techniques in designing efficient distributed algorithms. In particular, we show that using sampling techniques, on some networks, sele...
Software distributed shared memory (DSM) techniques, while effective on applications with coarse-grained sharing, yield poor performance for the fine-grained sharing encountered i...
We present a fast and scalable matrix multiplication algorithm on distributed memory concurrent computers, whose performance is independent of data distribution on processors, and...
Given a degree-n polynomial f(x) over an arbitrary ring, the shift of f(x) by c is the operation which computes the coefficients of the polynomial f(x + c). In this paper we conside...
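To make the definition of the shift concrete, here is a minimal sketch of the classical quadratic-time method (repeated Horner-style synthetic division). It is not the algorithm of the paper above, whose details are not given in this excerpt. The function name `taylor_shift` and the ascending-order coefficient list are assumptions made for illustration.

```python
def taylor_shift(coeffs, c):
    """Return the coefficients of f(x + c), given those of f(x).

    Coefficients are in ascending order: coeffs[k] multiplies x**k.
    This is the textbook O(n^2) method, shown only to illustrate the
    definition; it uses only ring additions and multiplications.
    """
    a = list(coeffs)  # work on a copy so the input is not modified
    n = len(a)
    for i in range(n - 1):
        # One Horner pass over the coefficients at positions i..n-1.
        for j in range(n - 2, i - 1, -1):
            a[j] += c * a[j + 1]
    return a


# f(x) = x^2 shifted by 1 is (x + 1)^2 = 1 + 2x + x^2.
print(taylor_shift([0, 0, 1], 1))
```

Because the method uses only additions and multiplications, it applies to elements of any ring that support `+` and `*` in Python, for example integers or `fractions.Fraction`.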
In this paper, we propose a new family of interconnection networks, called cyclic networks (CNs), in which an intercluster connection is defined on a set of nodes whose addresses...
In this paper, we propose three different parallel algorithms based on a state-of-the-art global router called TimberWolfSC. The parallel algorithms have been implemented by using...
We present a customizable simulator called netsim for high-performance point-to-point workstation networks that is accurate enough to be used for application-level performance ana...
Mustafa Uysal, Anurag Acharya, Robert Bennett, Joe...
Branch-and-bound algorithms are general methods applicable to various combinatorial optimization problems, and parallelization is one of the most promising methods to improve these a...