Sciweavers

468 search results - page 66 / 94
» A compiler for high performance computing with many-core acc...
Sort
View
PPOPP
2009
ACM
14 years 8 months ago
Mapping parallelism to multi-cores: a machine learning based approach
The efficient mapping of program parallelism to multi-core processors is highly dependent on the underlying architecture. This paper proposes a portable and automatic compiler-bas...
Zheng Wang, Michael F. P. O'Boyle
CCGRID
2001
IEEE
13 years 11 months ago
Parallel I/O Support for HPF on Clusters
Clusters of workstations are a popular alternative to integrated parallel systems designed and built by a vendor. Besides their huge cumulative processing power, they also provide...
Peter Brezany, Viera Sipková
CASES
2007
ACM
13 years 11 months ago
A fast and generic hybrid simulation approach using C virtual machine
Instruction Set Simulators (ISSes) are important tools for cross-platform software development. The simulation speed is a major concern and many approaches have been proposed to i...
Lei Gao, Stefan Kraemer, Rainer Leupers, Gerd Asch...
ASPLOS
2009
ACM
14 years 8 months ago
QR decomposition on GPUs
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A into the product of a unitary matrix Q and upper triangular matrix R. Adaptive sys...
Andrew Kerr, Dan Campbell, Mark Richards
GLVLSI
2007
IEEE
162views VLSI» more  GLVLSI 2007»
13 years 11 months ago
Utilizing custom registers in application-specific instruction set processors for register spills elimination
Application-specific instruction set processor (ASIP) has become an important design choice for embedded systems. It can achieve both high flexibility offered by the base processo...
Hai Lin, Yunsi Fei