A critical step in all quartet methods for constructing evolutionary trees is the inference of the topology for each set of four sequences (i.e. quartet). It is a well–known fact that all quartet topology inference methods make mistakes that result in the incorrect inference of quartet topology. These mistakes are called quartet errors. In this paper, two efficient algorithms for correcting bounded numbers of quartet errors are presented. These “quartet cleaning” algorithms are shown to be optimal in that no algorithm can correct more quartet errors. An extensive simulation study reveals that sets of quartet topologies inferred by three popular methods (Neighbor Joining [15], Ordinal Quartet [14] and Maximum Parsimony [10]) almost always contain quartet errors and that a large portion of these quartet errors are corrected by the quartet cleaning algorithms.
Vincent Berry, Tao Jiang, Paul E. Kearney, Ming Li