Sciweavers

ICPP
1995
IEEE
13 years 12 months ago
Fusion of Loops for Parallelism and Locality
Loop fusion improves data locality and reduces synchronization in data-parallel applications. However, loop fusion is not always legal. Even when legal, fusion may introduce loop-...
Naraig Manjikian, Tarek S. Abdelrahman
ACMMSP
2005
ACM
14 years 2 months ago
A locality-improving dynamic memory allocator
In general-purpose applications, most data is dynamically allocated. The memory manager therefore plays a crucial role in application performance by determining the spatial locali...
Yi Feng, Emery D. Berger
TPDS
2002
13 years 8 months ago
Recursive Array Layouts and Fast Matrix Multiplication
The performance of both serial and parallel implementations of matrix multiplication is highly sensitive to memory system behavior. False sharing and cache conflicts cause traditi...
Siddhartha Chatterjee, Alvin R. Lebeck, Praveen K....
CN
1999
13 years 8 months ago
The Gecko NFS Web Proxy
The World-Wide Web provides remote access to pages using its own naming scheme (URLs), transfer protocol (HTTP), and cache algorithms. Not only does using these special-purpose me...
Scott M. Baker, John H. Hartman
POPL
2007
ACM
14 years 8 months ago
Locality approximation using time
Reuse distance (i.e. LRU stack distance) precisely characterizes program locality and has been a basic tool for memory system research since the 1970s. However, the high cost of m...
Xipeng Shen, Jonathan Shaw, Brian Meeker, Chen Din...