Data mining techniques are routinely used by fundraisers to select those prospects from a large pool of candidates who are most likely to make a financial contribution. These techniques often rely on statistical models based on trial performance data. This trial performance data is typically obtained by soliciting a smaller sample of the possible prospect pool. Collecting this trial data involves a cost; therefore the fundraiser is interested in keeping the trial size small while still collecting enough data to build a reliable statistical model that will be used to evaluate the remainder of the prospects. We describe an experimental design approach to optimally choose the trial prospects from an existing large pool of prospects. Prospects are clustered to render the problem practically tractable. We modify the standard D-optimality algorithm to prevent repeated selection of the same prospect cluster, since each prospect can only be solicited at most once. We assess the benefits of th...
Uwe F. Mayer, Armand Sarkissian