Abstract—Recent improvements in high-throughput genotyping technology make possible genome-wide association studies and disease status prediction (classification) for common complex diseases. This paper addresses three challenges commonly facing such studies: (i) searching an enormous number of possible gene interactions, (ii) validating the reproducibility of associations, and (iii) reliably predicting disease status. These challenges have traditionally been addressed with statistical methods; here we apply computational approaches, namely optimization and cross-validation. A complex risk factor is modeled as a subset of SNPs with specified alleles, and the optimization formulation asks for the subset with the maximum odds ratio. When searching for disease-associated risk factors, we show that greedy heuristics are much faster than exhaustive search and, within a reasonable amount of time, lead to significantly better solutions. We propose a novel randomized complementary greedy search method that is advan...