Sciweavers

WSC
1997
13 years 8 months ago
Integrating Distributed Simulation Objects
Creating comprehensive simulation models can be expensive and time-consuming. This paper discusses our efforts to develop a general methodology that will allow users to quickly an...
Joseph A. Heim
ASPLOS
1996
ACM
13 years 11 months ago
An Integrated Compile-Time/Run-Time Software Distributed Shared Memory System
On a distributed memory machine, hand-coded message passing leads to the most efficient execution, but it is difficult to use. Parallelizing compilers can approach the performance...
Sandhya Dwarkadas, Alan L. Cox, Willy Zwaenepoel
ICS
2009
Tsinghua U.
14 years 2 months ago
High-performance CUDA kernel execution on FPGAs
In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state-of-the-art high-level synthesis tool AutoPilot from AutoESL, to...
Alexandros Papakonstantinou, Karthik Gururaj, John...
DAC
2005
ACM
14 years 8 months ago
Fine-grained application source code profiling for ASIP design
Current Application Specific Instruction set Processor (ASIP) design methodologies are mostly based on iterative architecture exploration that uses Architecture Description Langua...
Kingshuk Karuri, Mohammad Abdullah Al Faruque, Ste...
SIGOPS
2008
13 years 7 months ago
PipesFS: fast Linux I/O in the unix tradition
This paper presents PipesFS, an I/O architecture for Linux 2.6 that increases I/O throughput and adds support for heterogeneous parallel processors by (1) collapsing many I/O inte...
Willem de Bruijn, Herbert Bos