The Web has established itself as the dominant medium for electronic commerce. Consequently, the number of service providers, both large and small, advertising their services on the Web continues to grow. Such web presences range from a simple reference to the service provider in a referral page containing many such references to a full-blown web site of the service provider. Creating queryable service directories by mining such web presences will add impetus to e-commerce activities on the Web. In this paper we describe new extraction algorithms for mining service directories from web pages. Services are characterized by an ontology consisting of a taxonomy of service concepts, their associated attributes (such as names and addresses), type descriptions for the attributes, and attribute identifier functions that locate occurrences of a service’s attributes in a web page. Two central extraction issues that arise in the mining of service directories from web documents a...
Hasan Davulcu, Saikat Mukherjee, I. V. Ramakrishnan