Good data layouts improve cache and TLB performance of object-oriented software, but unfortunately, selecting an optimal data layout a priori is NP-hard. This paper introduces layo...
—With the appearance of massively parallel and inexpensive platforms such as the G80 generation of NVIDIA GPUs, more real-life applications will be designed or ported to these pl...
This paper presents a novel architecture for domain-specific FPGA devices. This architecture can be optimised for both speed and density by exploiting domain-specific informatio...
Chun Hok Ho, Chi Wai Yu, Philip Heng Wai Leong, Wa...
With the trend towards increasing number of processor cores in future chip architectures, scalable directory-based protocols for maintaining cache coherence will be needed. Howeve...
Abstract—We present a set of modeling constructs accompanied by a high performance simulation kernel for accuracy adaptive transaction level models. In contrast to traditional, ...