Sciweavers

30 search results - page 5 / 6
» Performance scalability of decoupled software pipelining
Sort
View
HPCA
2011
IEEE
12 years 11 months ago
HAQu: Hardware-accelerated queueing for fine-grained threading on a chip multiprocessor
Queues are commonly used in multithreaded programs for synchronization and communication. However, because software queues tend to be too expensive to support finegrained paralle...
Sanghoon Lee, Devesh Tiwari, Yan Solihin, James Tu...
SIGMETRICS
2002
ACM
13 years 7 months ago
Full-system timing-first simulation
Computer system designers often evaluate future design alternatives with detailed simulators that strive for functional fidelity (to execute relevant workloads) and performance fi...
Carl J. Mauer, Mark D. Hill, David A. Wood
FPGA
2009
ACM
343views FPGA» more  FPGA 2009»
14 years 2 months ago
Fpga-based face detection system using Haar classifiers
This paper presents a hardware architecture for face detection based system on AdaBoost algorithm using Haar features. We describe the hardware design techniques including image s...
Junguk Cho, Shahnam Mirzaei, Jason Oberg, Ryan Kas...
MICRO
2007
IEEE
168views Hardware» more  MICRO 2007»
14 years 1 months ago
Global Multi-Threaded Instruction Scheduling
Recently, the microprocessor industry has moved toward chip multiprocessor (CMP) designs as a means of utilizing the increasing transistor counts in the face of physical and micro...
Guilherme Ottoni, David I. August
ISCA
1995
IEEE
120views Hardware» more  ISCA 1995»
13 years 11 months ago
Streamlining Data Cache Access with Fast Address Calculation
For many programs, especially integer codes, untolerated load instruction latencies account for a significant portion of total execution time. In this paper, we present the desig...
Todd M. Austin, Dionisios N. Pnevmatikatos, Gurind...