Sciweavers

1224 search results - page 96 / 245
» Design and Implementation of a Practical Parallel Delaunay A...
GRID
2007
Springer
15 years 10 months ago
Grid-based asynchronous replica exchange
Replica exchange is a powerful sampling algorithm and can be effectively used for applications such as simulating the structure, function, folding, and dynamics of proteins and...
Zhen Li, Manish Parashar
PVM
2005
Springer
15 years 9 months ago
Scalable Fault Tolerant MPI: Extending the Recovery Algorithm
Fault Tolerant MPI (FT-MPI)[6] was designed as a solution to allow applications different methods to handle process failures beyond simple check-point restart schemes. The init...
Graham E. Fagg, Thara Angskun, George Bosilca, Jel...
ASAP
2007
IEEE
15 years 8 months ago
Customizing Reconfigurable On-Chip Crossbar Scheduler
We present a design of a customized crossbar scheduler for on-chip networks. The proposed scheduler arbitrates on-demand interconnects, where physical topologies are identical to ...
Jae Young Hur, Todor Stefanov, Stephan Wong, Stama...
PPOPP
2010
ACM
16 years 1 month ago
Scaling LAPACK panel operations using parallel cache assignment
In LAPACK many matrix operations are cast as block algorithms which iteratively process a panel using an unblocked algorithm and then update a remainder matrix using the high perf...
Anthony M. Castaldo, R. Clint Whaley
ICFP
2012
ACM
13 years 6 months ago
Nested data-parallelism on the GPU
Graphics processing units (GPUs) provide both memory bandwidth and arithmetic performance far greater than that available on CPUs but, because of their Single-Instruction-Multiple...
Lars Bergstrom, John H. Reppy