Abstract. We address the problem of matching imperfectly documented schemas of data streams and large databases. Instance-level schema matching algorithms identify likely correspondences between attributes by quantifying the similarity of their corresponding values. However, exact calculation of these similarities requires processing of all database records