Wild populations of organisms are often difficult to study in their natural settings. It is frequently possible, however, to infer mating information about these species by genotyping the offspring and using the genetic data to reconstruct sibling and other kinship relationships. Although sibling reconstruction has been studied for a long time, none of the existing approaches has targeted scalability. In this paper, we introduce the first parallel approach to reconstructing sibling relationships from microsatellite markers. We use both functional and data domain decomposition to break down the problem, and we argue that this approach can be applied to other problems in which columns are independent and simple constraint-based enumeration is required. We discuss algorithmic and implementation choices and their effects on the results. We show that our approach is highly efficient and scalable.
Saad I. Sheikh, Ashfaq A. Khokhar, Tanya Y. Berger