This site uses cookies to deliver our services and to ensure you get the best experience. By continuing to use this site, you consent to our use of cookies and acknowledge that you have read and understand our Privacy Policy, Cookie Policy, and Terms
Abstract. Themain contributionof thiswork isto propose a numberof broadcastefficient VLSI architectures for computing the sum and the prefix sums of a w k-bit, k 2, binary sequenc...
Rong Lin, Koji Nakano, Stephan Olariu, Maria Crist...
Parallelizing compiler technology has improved in recent years. One area in which compilers have made progress is in handling DOACROSS loops, where crossprocessor data dependencie...
Instance based locality optimization 6 is a semi automatic program restructuring method that reduces the number of cache misses. The method imitates the human approach of consideri...
Abstract. The combination of a language with ne-grain implicit parallelism and a data
ow evaluation scheme is suitable for high-level programming on massively parallel architectur...
We investigate the average-case speed and scalability of parallel algorithms executing on multiprocessors. Our performance metrics are average-speed and isospeed scalability. By m...
The known fast sequential algorithms for multiplying two N N matrices (over an arbitrary ring) have time complexity ON , where 2 3. The current best value of is less than 2.3755....
Executing subordinate activities by pushing return addresses on the stack is the most e cient working mode for sequential programs. It is supported by all current processors, yet i...
The RAIN (Reliable Array of Independent Nodes) project at Caltech is focusing on creating highly reliable distributed systems by leveraging commercially available personal compute...