Sciweavers

» Parallel Computing with FPGAs - Concepts and Applications
EUROPAR
2009
Springer
15 years 8 months ago
Detailed Performance Analysis Using Coarse Grain Sampling
Performance evaluation tools enable analysts to shed light on how applications behave both from a general point of view and at concrete execution points, but cannot provide detaile...
Harald Servat, Germán Llort, Judit Gimenez,...
IWOMP
2009
Springer
15 years 8 months ago
Dynamic Task and Data Placement over NUMA Architectures: An OpenMP Runtime Perspective
Abstract. Exploiting the full computational power of current hierarchical multiprocessor machines requires a very careful distribution of threads and data among the underlying non-...
François Broquedis, Nathalie Furmento, Bric...
ASPLOS
2008
ACM
15 years 5 months ago
SoftSig: software-exposed hardware signatures for code analysis and optimization
Many code analysis techniques for optimization, debugging, or parallelization need to perform runtime disambiguation of sets of addresses. Such operations can be supported efficie...
James Tuck, Wonsun Ahn, Luis Ceze, Josep Torrellas
CAL
2007
15 years 3 months ago
Low-Cost Microarchitectural Support for Improved Floating-Point Accuracy
Abstract—Some processors designed for consumer applications, such as Graphics Processing Units (GPUs) and the CELL processor, promise outstanding floating-point performance for ...
William R. Dieter, A. Kaveti, Henry G. Dietz
MOC
2002
15 years 3 months ago
Convergence rate analysis of an asynchronous space decomposition method for convex minimization
Abstract. We analyze the convergence rate of an asynchronous space decomposition method for constrained convex minimization in a reflexive Banach space. This method includes as spe...
Xue-Cheng Tai, Paul Tseng