Sciweavers

340 search results - page 61 / 68
» On the Interplay of Parallelization, Program Performance, an...
Sort
View
IPPS
2010
IEEE
13 years 5 months ago
Improving numerical reproducibility and stability in large-scale numerical simulations on GPUs
The advent of general purpose graphics processing units (GPGPU's) brings about a whole new platform for running numerically intensive applications at high speeds. Their multi-...
Michela Taufer, Omar Padron, Philip Saponaro, Sand...
HIPC
1999
Springer
13 years 11 months ago
Microcaches
We describe a radically new cache architecture and demonstrate that it offers a huge reduction in cache cost, size and power consumption whilst maintaining performance on a wide ra...
David May, Dan Page, James Irwin, Henk L. Muller
ASPDAC
2007
ACM
119views Hardware» more  ASPDAC 2007»
13 years 11 months ago
Optimum Prefix Adders in a Comprehensive Area, Timing and Power Design Space
Parallel prefix adder is the most flexible and widely-used binary adder for ASIC designs. Many high-level synthesis techniques have been developed to find optimal prefix structures...
Jianhua Liu, Yi Zhu, Haikun Zhu, Chung-Kuan Cheng,...
ICPPW
2007
IEEE
14 years 1 months ago
Power Management of Multicore Multiple Voltage Embedded Systems by Task Scheduling
We study the role of task-level scheduling in power management on multicore multiple voltage embedded systems. Multicore on-achip, in particular DSP systems, can greatly improve p...
Gang Qu
HPCA
2009
IEEE
14 years 8 months ago
Prediction router: Yet another low latency on-chip router architecture
Network-on-Chips (NoCs) are quite latency sensitive, since their communication latency strongly affects the application performance on recent many-core architectures. To reduce th...
Hiroki Matsutani, Michihiro Koibuchi, Hideharu Ama...