In this paper, we propose a new approach for automatically clustering e-commerce search engines (ESEs) on the Web so that ESEs in the same cluster sell similar products. This allows an e-commerce metasearch engine (comparison shopping system) to be built over the ESEs in each cluster. Our approach performs the clustering based on features available on the interface page (i.e., the Web page containing the search form or interface) of each ESE. The features used include the number of links, the number of images, the terms appearing in the search form, and normalized price terms. Our experimental results, based on nearly 300 ESEs, indicate that this approach can achieve good results.

Categories and Subject Descriptors
H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval – Clustering; H.3.5: Online Information Services – Commercial Services, Web-based Services.

General Terms
Algorithms, Performance, Design, Experimentation.

Keywords
Search engine c...
Qian Peng, Weiyi Meng, Hai He, Clement T. Yu
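
As an illustration of the interface-page features named in the abstract (number of links, number of images, search-form terms, and normalized price terms), the following is a minimal sketch of how such features might be collected from the HTML of one ESE interface page. It is not the authors' implementation: the names `InterfacePageFeatures`, `normalize_price`, and `extract_features`, the dollar-only price pattern, and the digit-count normalization are all assumptions made for this example, since the abstract does not say how price terms are normalized.

```python
import re
from html.parser import HTMLParser

# Assumption: prices are written with a dollar sign, e.g. "$19.99" or "$1,299.00".
PRICE_RE = re.compile(r"\$\s?\d+(?:,\d{3})*(?:\.\d{2})?")


def normalize_price(price_text):
    """Map a price string to a coarse token (hypothetical normalization).

    Prices are bucketed by the number of digits in their integer part, so
    that pages selling similarly priced goods share the same price term.
    """
    value = float(price_text.replace("$", "").replace(",", "").strip())
    return "PRICE_%d_DIGITS" % len(str(int(value)))


class InterfacePageFeatures(HTMLParser):
    """Collects link count, image count, form terms, and raw price strings."""

    def __init__(self):
        super().__init__()
        self.num_links = 0
        self.num_images = 0
        self.form_terms = []
        self.prices = []
        self._form_depth = 0  # > 0 while the parser is inside a <form>

    def handle_starttag(self, tag, attrs):
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.num_links += 1
        elif tag == "img":
            self.num_images += 1
        elif tag == "form":
            self._form_depth += 1

    def handle_endtag(self, tag):
        if tag == "form" and self._form_depth > 0:
            self._form_depth -= 1

    def handle_data(self, data):
        # Price strings are collected from all visible text on the page.
        self.prices.extend(PRICE_RE.findall(data))
        # Only text inside the search form contributes form terms.
        if self._form_depth > 0:
            self.form_terms.extend(re.findall(r"[a-z]+", data.lower()))


def extract_features(html):
    """Return a feature dictionary for one interface page."""
    parser = InterfacePageFeatures()
    parser.feed(html)
    parser.close()
    return {
        "num_links": parser.num_links,
        "num_images": parser.num_images,
        "form_terms": parser.form_terms,
        "price_terms": [normalize_price(p) for p in parser.prices],
    }
```

Feature dictionaries of this kind could then be turned into vectors and passed to any standard clustering algorithm. The choice of algorithm and of feature weights is left open here because the abstract does not specify them.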