Information integration systems combine data from multiple heterogeneous Web services to answer complex user queries, provided a user has semantically modeled the service first. To model a service, the user has to specify semantic types of the input and output data it uses and its functionality. As large number of new services come online, it is impractical to require the user to come up with a semantic model of the service or rely on the service providers to conform to a standard. Instead, we would like to automatically learn the semantic model of a new service. This paper addresses one part of the problem: namely, automatically recognizing semantic types of the data used by Web services. We describe a metadatabased classification method for recognizing input data types using only the terms extracted from a Web Service Definition file. We then verify the classifier's predictions by invoking the service with some sample data of that type. Once we discover correct classification, ...
Kristina Lerman, Anon Plangprasopchok, Craig A. Kn