Search Sciweavers | Sciweavers

624 search results - page 24 / 125

» High Performance Matrix Multiplication on Many Cores

SC 1995, ACM (Applied Computing)

Parallel Matrix-Vector Product Using Approximate Hierarchical Methods

15 years 7 months ago

Download www.chg.ru

Matrix-vector products (mat-vecs) form the core of iterative methods used for solving dense linear systems. Often, these systems arise in the solution of integral equations used i...

Ananth Grama, Vipin Kumar, Ahmed H. Sameh

ISLPED 2003, ACM (Hardware)

A mixed-clock issue queue design for globally asynchronous, locally synchronous processor cores

15 years 9 months ago

Download www.ece.cmu.edu

Ever shrinking device sizes and innovative micro-architectural and circuit design techniques have made it possible to have multi-million transistor systems running at multi-gigahe...

Venkata Syam P. Rapaka, Diana Marculescu

ISCA 2009, IEEE (Hardware)

PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches

15 years 11 months ago

Download www.cc.gatech.edu

Many multi-core processors employ a large last-level cache (LLC) shared among the multiple cores. Past research has demonstrated that sharing-oblivious cache management policies (...

Yuejian Xie, Gabriel H. Loh

ARC 2012, Springer (Hardware)

A High Throughput FPGA-Based Implementation of the Lanczos Method for the Symmetric Extremal Eigenvalue Problem

13 years 12 months ago

Download cas.ee.ic.ac.uk

Iterative numerical algorithms with high memory bandwidth requirements but medium-size data sets (matrix size ∼ a few 100s) are highly appropriate for FPGA acceleration. This pap...

Abid Rafique, Nachiket Kapre, George A. Constantin...

ARC 2008, Springer (Hardware)

A High Throughput FPGA-based Floating Point Conjugate Gradient Implementation

15 years 6 months ago

Download cas.ee.ic.ac.uk

As Field Programmable Gate Arrays (FPGAs) have reached capacities beyond millions of equivalent gates, it becomes possible to accelerate floating-point scientific computing applica...

Antonio Roldao Lopes, George A. Constantinides
