Tightly coupled parallel applications are increasingly run in Grid environments. Unfortunately, on many Grid sites the ability of machines to create or accept network connections is severely limited by firewalls, network address translation (NAT), or non-routed networks. Multihoming further complicates connection setup and machine identification. Although ad-hoc solutions exist for some of these problems, it is usually up to the application’s user to discover the cause of the connectivity problems and find a solution. In this paper, we describe SmartSockets, a communication library that lifts this burden by automatically discovering connectivity problems and solving them with as little support from the user as possible.

Categories and Subject Descriptors: C.2.4 [Distributed Systems]: Distributed applications

General Terms: Algorithms, Design, Reliability
Jason Maassen, Henri E. Bal