Sciweavers

604 search results - page 88 / 121
» Advances in design and implementation of optimization softwa...
MICRO
2010
IEEE
13 years 6 months ago
A Task-Centric Memory Model for Scalable Accelerator Architectures
This paper presents a task-centric memory model for 1000-core compute accelerators. Visual computing applications are emerging as an important class of workloads that can exploit ...
John H. Kelm, Daniel R. Johnson, Steven S. Lumetta...
ASPLOS
2011
ACM
12 years 11 months ago
Sponge: portable stream programming on graphics engines
Graphics processing units (GPUs) provide a low cost platform for accelerating high performance computations. The introduction of new programming languages, such as CUDA and OpenCL...
Amir Hormati, Mehrzad Samadi, Mark Woh, Trevor N. ...
DAIS
1997
13 years 9 months ago
A System for Specifying and Coordinating the Execution of Reliable Distributed Applications
An increasing number of distributed applications are being constructed by composing them out of existing applications. The resulting applications can be very complex in structure,...
Frédéric Ranno, Santosh K. Shrivasta...
ASPDAC
2005
ACM
14 years 1 months ago
Scalable interprocedural register allocation for high level synthesis
The success of classical high level synthesis has been limited by the complexity of the applications it can handle, typically not large enough to necessitate the depart...
Rami Beidas, Jianwen Zhu
PARA
1995
Springer
13 years 11 months ago
A Proposal for a Set of Parallel Basic Linear Algebra Subprograms
This paper describes a proposal for a set of Parallel Basic Linear Algebra Subprograms (PBLAS). The PBLAS are targeted at distributed vector-vector, matrix-vector and matrix-matrix...
Jaeyoung Choi, Jack Dongarra, Susan Ostrouchov, An...