Performance evaluation tools enable analysts to shed light on how applications behave both from a general point of view and at concrete execution points, but cannot provide detaile...
Abstract. Exploiting the full computational power of current hierarchical multiprocessor machines requires a very careful distribution of threads and data among the underlying non-...
Many code analysis techniques for optimization, debugging, or parallelization need to perform runtime disambiguation of sets of addresses. Such operations can be supported efficie...
James Tuck, Wonsun Ahn, Luis Ceze, Josep Torrellas
Abstract—Some processors designed for consumer applications, such as Graphics Processing Units (GPUs) and the CELL processor, promise outstanding floating-point performance for ...
Abstract. We analyze the convergence rate of an asynchronous space decomposition method for constrained convex minimization in a reflexive Banach space. This method includes as spe...