Sciweavers

643 search results - page 94 / 129
» Using Hardware Counters to Automatically Improve Memory Perf...
Sort
View
IEEEPACT
2006
IEEE
14 years 2 months ago
Branch predictor guided instruction decoding
Fast instruction decoding is a challenge for the design of CISC microprocessors. A well-known solution to overcome this problem is using a trace cache. It stores and fetches alrea...
Oliverio J. Santana, Ayose Falcón, Alex Ram...
LCN
2006
IEEE
14 years 2 months ago
Minimizing Cache Misses in an Event-driven Network Server: A Case Study of TUX
We analyze the performance of CPU-bound network servers and demonstrate experimentally that the degradation in the performance of these servers under highconcurrency workloads is ...
Sapan Bhatia, Charles Consel, Julia L. Lawall
IEEEPACT
2005
IEEE
14 years 2 months ago
An Event-Driven Multithreaded Dynamic Optimization Framework
Dynamic optimization has the potential to adapt the program’s behavior at run-time to deliver performance improvements over static optimization. Dynamic optimization systems usu...
Weifeng Zhang, Brad Calder, Dean M. Tullsen
FPGA
2005
ACM
195views FPGA» more  FPGA 2005»
14 years 2 months ago
Sparse Matrix-Vector multiplication on FPGAs
Floating-point Sparse Matrix-Vector Multiplication (SpMXV) is a key computational kernel in scientific and engineering applications. The poor data locality of sparse matrices sig...
Ling Zhuo, Viktor K. Prasanna
ICCAD
2008
IEEE
97views Hardware» more  ICCAD 2008»
14 years 5 months ago
Integrated code and data placement in two-dimensional mesh based chip multiprocessors
— As transistor sizes continue to shrink and the number of transistors per chip keeps increasing, chip multiprocessors (CMPs) are becoming a promising alternative to remain on th...
Taylan Yemliha, Shekhar Srikantaiah, Mahmut T. Kan...