Instruction combining is an optimization to replace a sequence of instructions with a more efficient instruction yielding the same result in a fewer machine cycles. When we use it...
Sampling has been successfully used to identify performance optimization opportunities. We would like to apply similar techniques to check program correctness. Unfortunately, samp...
A coarse-grain multithreaded processor can effectively hide long memory latencies by quickly switching to an alternate task when the active task issues a memory request, improving...
Many optimization techniques, including several targeted specifically at embedded systems, depend on the ability to calculate the number of elements that satisfy certain conditio...
Sven Verdoolaege, Rachid Seghir, Kristof Beyls, Vi...
Many parallel applications from scientific computing use MPI global communication operations to collect or distribute data. Since the execution times of these communication opera...