Detecting data races in parallel programs is important for both software development and production-run diagnosis. Recently, there have been several proposals for hardware-assiste...
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Memory size reduction and memory accesses optimization are crucial issues for embedded systems. In the context of affine programs, these two challenges are classically tackled by ...
Reconfigurable Processors utilize a reconfigurable fabric (to implement application-specific accelerators) and may perform runtime reconfigurations to exchange the set of deployed...
Applications need to become more concurrent to take advantage of the increased computational power provided by chip level multiprocessing. Programmers have traditionally managed t...
Bratin Saha, Ali-Reza Adl-Tabatabai, Richard L. Hu...