Many code analysis techniques for optimization, debugging, or parallelization need to perform runtime disambiguation of sets of addresses. Such operations can be supported efficie...
James Tuck, Wonsun Ahn, Luis Ceze, Josep Torrellas
Optimizations aimed at reducing the impact of memory operations on execution speed have long concentrated on improving cache performance. These efforts achieve a reasonable level...
Despite their superior performance for multimedia applications, vector processors have three limitations that hinder their widespread acceptance. First, the complexity and size of...
Modern network processors support high levels of parallelism in packet processing by supporting multiple threads that execute on a micro-engine. Threads switch context upon encoun...
R. Collins, Fernando Alegre, Xiaotong Zhuang, Sant...
Clustered ILP processors are characterized by a large number of non-centralized on-chip resources grouped into clusters. Traditional code generation schemes for these processors c...
Krishnan Kailas, Kemal Ebcioglu, Ashok K. Agrawala