Sciweavers

1294 search results - page 150 / 259
» FFT Compiler Techniques
Sort
View
PLDI
2009
ACM
14 years 3 months ago
Parallelizing sequential applications on commodity hardware using a low-cost software transactional memory
Multicore designs have emerged as the mainstream design paradigm for the microprocessor industry. Unfortunately, providing multiple cores does not directly translate into performa...
Mojtaba Mehrara, Jeff Hao, Po-Chun Hsu, Scott A. M...
IEEEPACT
2007
IEEE
14 years 3 months ago
Performance Portable Optimizations for Loops Containing Communication Operations
Effective use of communication networks is critical to the performance and scalability of parallel applications. Partitioned Global Address Space languages like UPC bring the pro...
Costin Iancu, Wei Chen, Katherine A. Yelick
ISPASS
2007
IEEE
14 years 3 months ago
Cross Binary Simulation Points
Architectures are usually compared by running the same workload on each architecture and comparing performance. When a single compiled binary of a program is executed on many diff...
Erez Perelman, Jeremy Lau, Harish Patil, Aamer Jal...
HPCC
2007
Springer
14 years 3 months ago
Optimizing Array Accesses in High Productivity Languages
One of the outcomes of DARPA’s HPCS program has been the creation of three new high productivity languages: Chapel, Fortress, and X10. While these languages have introduced impro...
Mackale Joyner, Zoran Budimlic, Vivek Sarkar
APCSAC
2005
IEEE
14 years 2 months ago
An Integrated Partitioning and Scheduling Based Branch Decoupling
Conditional branch induced control hazards cause significant performance loss in modern out-of-order superscalar processors. Dynamic branch prediction techniques help alleviate th...
Pramod Ramarao, Akhilesh Tyagi