Genetic algorithms are often applied to combinatorial optimization problems, probably the most popular being the traveling salesperson problem (TSP). In contrast to the permutations used for the TSP, the selection of a subset from a larger set has so far attracted surprisingly little interest. One intriguing example of this type of problem occurs in diversity selection for virtual high-throughput screening, where k molecules need to be selected from a set of n while optimizing certain criteria. In this paper we present a novel representation for k-subsets and several genetic operators for it.

Categories and Subject Descriptors: I.2.8 [Artificial Intelligence]: Problem Solving, Control Methods, and Search — Heuristic methods

General Terms: Algorithms
Thorsten Meinl, Michael R. Berthold