Sciweavers

75 search results - page 13 / 15
» Interprocedural transformations for parallel code generation
Sort
View
ASPLOS
2010
ACM
14 years 5 months ago
MacroSS: macro-SIMDization of streaming applications
SIMD (Single Instruction, Multiple Data) engines are an essential part of the processors in various computing markets, from servers to the embedded domain. Although SIMD-enabled a...
Amir Hormati, Yoonseo Choi, Mark Woh, Manjunath Ku...
CGO
2008
IEEE
14 years 5 months ago
Parallel-stage decoupled software pipelining
In recent years, the microprocessor industry has embraced chip multiprocessors (CMPs), also known as multi-core architectures, as the dominant design paradigm. For existing and ne...
Easwaran Raman, Guilherme Ottoni, Arun Raman, Matt...
CASES
2006
ACM
14 years 4 months ago
High-level languages for small devices: a case study
In this paper we study, through a concrete case, the feasibility of using a high-level, general-purpose logic language in the design and implementation of applications targeting w...
Manuel Carro, José F. Morales, Henk L. Mull...
HPCC
2009
Springer
14 years 3 months ago
On Instruction-Level Method for Reducing Cache Penalties in Embedded VLIW Processors
Usual cache optimisation techniques for high performance computing are difficult to apply in embedded VLIW applications. First, embedded applications are not always well structur...
Samir Ammenouche, Sid Ahmed Ali Touati, William Ja...
FCCM
2011
IEEE
331views VLSI» more  FCCM 2011»
13 years 2 months ago
Synthesis of Platform Architectures from OpenCL Programs
—The problem of automatically generating hardware modules from a high level representation of an application has been at the research forefront in the last few years. In this pap...
Muhsen Owaida, Nikolaos Bellas, Konstantis Dalouka...