Sciweavers

244 search results - page 29 / 49
» Optimizing Loop Performance for Clustered VLIW Architectures
FCCM
2011
IEEE
12 years 11 months ago
Synthesis of Platform Architectures from OpenCL Programs
The problem of automatically generating hardware modules from a high level representation of an application has been at the research forefront in the last few years. In this pap...
Muhsen Owaida, Nikolaos Bellas, Konstantis Dalouka...
IPPS
2007
IEEE
14 years 2 months ago
POET: Parameterized Optimizations for Empirical Tuning
The excessive complexity of both machine architectures and applications has made it difficult for compilers to statically model and predict application behavior. This observatio...
Qing Yi, Keith Seymour, Haihang You, Richard W. Vu...
IWNAS
2008
IEEE
14 years 2 months ago
Software Barrier Performance on Dual Quad-Core Opterons
Multi-core processors based SMP servers have become building blocks for Linux clusters in recent years because they can deliver better performance for multi-threaded programs thro...
Jie Chen, William A. Watson III
TPDS
2002
13 years 7 months ago
Orthogonal Striping and Mirroring in Distributed RAID for I/O-Centric Cluster Computing
This paper presents a new distributed disk-array architecture for achieving high I/O performance in scalable cluster computing. In a serverless cluster of computers, all distribute...
Kai Hwang, Hai Jin, Roy S. C. Ho
JPDC
2008
13 years 7 months ago
A Grid-based Virtual Reactor: Parallel performance and adaptive load balancing
This paper addresses the problem of porting distributed parallel applications to the Grid. One of the challenges we address is the change from static homogeneous cluster environmen...
Vladimir Korkhov, Valeria V. Krzhizhanovskaya, Pet...