Sciweavers

453 search results - page 76 / 91
» Execution and Cache Performance of the Scheduled Dataflow Ar...
Sort
View
ANCS
2007
ACM
14 years 24 days ago
Ruler: high-speed packet matching and rewriting on NPUs
Programming specialized network processors (NPU) is inherently difficult. Unlike mainstream processors where architectural features such as out-of-order execution and caches hide ...
Tomas Hruby, Kees van Reeuwijk, Herbert Bos
CODES
2006
IEEE
14 years 2 months ago
Integrated analysis of communicating tasks in MPSoCs
Predicting timing behavior is key to efficient embedded real-time system design and verification. Especially memory accesses and co-processor calls over shared communication net...
Simon Schliecker, Matthias Ivers, Rolf Ernst
MICRO
1995
IEEE
102views Hardware» more  MICRO 1995»
14 years 10 days ago
Zero-cycle loads: microarchitecture support for reducing load latency
Untolerated load instruction latencies often have a significant impact on overall program performance. As one means of mitigating this effect, we present an aggressive hardware-b...
Todd M. Austin, Gurindar S. Sohi
MICRO
2005
IEEE
114views Hardware» more  MICRO 2005»
14 years 2 months ago
Address-Indexed Memory Disambiguation and Store-to-Load Forwarding
This paper describes a scalable, low-complexity alternative to the conventional load/store queue (LSQ) for superscalar processors that execute load and store instructions speculat...
Sam S. Stone, Kevin M. Woley, Matthew I. Frank
IEEEPACT
1998
IEEE
14 years 1 months ago
Athapascan-1: On-Line Building Data Flow Graph in a Parallel Language
In order to achieve practical efficient execution on a parallel architecture, a knowledge of the data dependencies related to the application appears as the key point for building...
François Galilée, Jean-Louis Roch, G...