Sciweavers

289 search results - page 43 / 58
» On the Utility of Threads for Data Parallel Programming
DAC
2008
ACM
13 years 10 months ago
Application mapping for chip multiprocessors
The problem attacked in this paper is one of automatically mapping an application onto a Network-on-Chip (NoC) based chip multiprocessor (CMP) architecture in a locality-aware fas...
Guangyu Chen, Feihui Li, Seung Woo Son, Mahmut T. ...
SBACPAD
2008
IEEE
14 years 3 months ago
Processing Neocognitron of Face Recognition on High Performance Environment Based on GPU with CUDA Architecture
This work presents an implementation of Neocognitron Neural Network, using a high performance computing architecture based on GPU (Graphics Processing Unit). Neocognitron is an ar...
Gustavo Poli, José Hiroki Saito, Joã...
HPCN
1998
Springer
14 years 27 days ago
PARAFLOW: A Dataflow Distributed Data-Computing System
We describe the Paraflow system for connecting heterogeneous computing services together into a flexible and efficient data-mining metacomputer. There are three levels of parallel...
Roy Williams, Bruce Sears
IEEEPACT
2003
IEEE
14 years 1 month ago
Compiler-Directed Content-Aware Prefetching for Dynamic Data Structures
This paper describes Compiler-Directed Content-Aware Prefetching (CDCAP), an integrated compiler and hardware approach for prefetching dynamic data structures. The approach utiliz...
Hassan Al-Sukhni, Ian Bratt, Daniel A. Connors
IPPS
2007
IEEE
14 years 3 months ago
Software and Algorithms for Graph Queries on Multithreaded Architectures
Search-based graph queries, such as finding short paths and isomorphic subgraphs, are dominated by memory latency. If input graphs can be partitioned appropriately, large cluster...
Jonathan W. Berry, Bruce Hendrickson, Simon Kahan,...