We pose the question: how do we efficiently evaluate a join operator distributed over a heterogeneous network? Our objective is to minimize the delay of output tuples. We discuss the key challenges involved in the distribution, namely how to partition the join operator, how to place the resulting partitions on the network, and how to route input values from sources to our operators. Our model revolves around one simple concept: exploiting locality. We consider data locality in the distribution of input data values, and network locality in the distribution of network distances between sites. We sketch strategies to partition the input data space and to instantiate a structured topology consisting of operator replicas to which tuples are routed for processing. Finally, we briefly discuss implementation issues that must be addressed to enable the networked join proposed here.