Sciweavers

Compiling for SIMD Within a Register
SOFTWARE
2011
13 years 2 months ago
A Synergetic Approach to Throughput Computing on x86-Based Multicore Desktops
In the era of multicores, many applications that tend to require substantial compute power and data crunching (aka Throughput Computing Applications) can now be run on desktop PCs...
Chi-Keung Luk, Ryan Newton, William Hasenplaugh, M...
ASPLOS
2009
ACM
14 years 8 months ago
Architectural support for SWAR text processing with parallel bit streams: the inductive doubling principle
Parallel bit stream algorithms exploit the SWAR (SIMD within a register) capabilities of commodity processors in high-performance text processing applications such as UTF8 to UTF-...
Robert D. Cameron, Dan Lin
DATE
2008
IEEE
14 years 1 months ago
Source-Level Timing Annotation and Simulation for a Heterogeneous Multiprocessor
A generic and retargetable tool flow is presented that enables the export of timing data from software running on a cycle-accurate Virtual Prototype (VP) to a concurrent function...
Trevor Meyerowitz, Alberto L. Sangiovanni-Vincente...
ESTIMEDIA
2006
Springer
13 years 11 months ago
Use of a Bit-true Data Flow Analysis for Processor-Specific Source Code Optimization
Nowadays, key characteristics of a processor's instruction set are only exploited in high-level languages by using inline assembly or compiler intrinsics. Inserting intrinsic...
Heiko Falk, Jens Wagner, André Schaefer
IEEEPACT
2002
IEEE
14 years 12 days ago
Optimizing Loop Performance for Clustered VLIW Architectures
Modern embedded systems often require high degrees of instruction-level parallelism (ILP) within strict constraints on power consumption and chip cost. Unfortunately, a high-perfo...
Yi Qian, Steve Carr, Philip H. Sweany