Sciweavers

34 search results - page 6 / 7
» Compiling scheme to JVM bytecode: : a performance study
Sort
View
CONCURRENCY
2006
140views more  CONCURRENCY 2006»
13 years 7 months ago
An efficient memory operations optimization technique for vector loops on Itanium 2 processors
To keep up with a large degree of instruction level parallelism (ILP), the Itanium 2 cache systems use a complex organization scheme: load/store queues, banking and interleaving. ...
William Jalby, Christophe Lemuet, Sid Ahmed Ali To...
ISCA
1995
IEEE
110views Hardware» more  ISCA 1995»
13 years 11 months ago
Optimization of Instruction Fetch Mechanisms for High Issue Rates
Recent superscalar processors issue four instructions per cycle. These processors are also powered by highly-parallel superscalar cores. The potential performance can only be expl...
Thomas M. Conte, Kishore N. Menezes, Patrick M. Mi...
ICCS
2004
Springer
14 years 23 days ago
Improving Geographical Locality of Data for Shared Memory Implementations of PDE Solvers
On cc-NUMA multi-processors, the non-uniformity of main memory latencies motivates the need for co-location of threads and data. We call this special form of data locality, geogra...
Henrik Löf, Markus Nordén, Sverker Hol...
TLCA
2009
Springer
14 years 1 months ago
Parametricity for Haskell with Imprecise Error Semantics
Types play an important role both in reasoning about Haskell and for its implementation. For example, the Glasgow Haskell Compiler performs certain fusion transformations that are...
Florian Stenger, Janis Voigtländer
ISCA
1999
IEEE
110views Hardware» more  ISCA 1999»
13 years 11 months ago
Decoupling Local Variable Accesses in a Wide-Issue Superscalar Processor
Providing adequate data bandwidth is extremely important for a wide-issue superscalar processor to achieve its full performance potential. Adding a large number of ports to a data...
Sangyeun Cho, Pen-Chung Yew, Gyungho Lee