In the era of multicores, many applications that tend to require substantial compute power and data crunching (aka Throughput Computing Applications) can now be run on desktop PCs...
Chi-Keung Luk, Ryan Newton, William Hasenplaugh, M...
Optimal network performance is critical to efficient parallel scaling for communication-bound applications on large machines. With wormhole routing, no-load latencies do not increa...
Abhinav Bhatele, Eric J. Bohm, Laxmikant V. Kal&ea...
DSP processors usually provide dedicated address generation units (AGUs) to assist address computation. By carefully allocating variables in the memory, DSP compilers take advanta...
Dynamic analysis is based on collecting data as the program runs. However, raw traces tend to be too voluminous and too unstructured to be used directly for visualization and unde...
Multiple wireless technologies are converging to run on personal handhelds. The plethora of communication standards next to the cost issues of deeper submicron processing require ...
A. C. H. Ng, J. W. Weijers, Miguel Glassee, Thomas...