In this paper, we introduce an algebraic approach to the foundations of data mining. Our approach is based upon two algebras of functions defined over a common state space X and a pairing between them. One algebra is an algebra of state space observations, and the other is an algebra of labeled sets of states. We interpret H as the algebraic encoding of the data and the pairing as the misclassification rate when the classifier f is applied to the set of states. We prove a realization theorem giving conditions on formal series of data sets built from D that imply there is a realization involving a state space X, a classifier f ∈ R, and a set of labeled states in R0 that yield this series.
Robert L. Grossman, Richard G. Larson