An unsupervised clustering of the webpages on a website is a primary requirement for most wrapper induction and automated data extraction methods. Since page content can vary dras...
Understanding intents from search queries can improve a user’s search experience and boost a site’s advertising profits. Query tagging via statistical sequential labeling mode...
Ye-Yi Wang, Raphael Hoffmann, Xiao Li, Jakub Szyma...
The feature selection and weighting are two important parts of automatic text classification. In this paper we give a new method based on concept attributes. We use the DEF Terms o...
SA_MetaMatch, a component of the Standards Advisor (SA), is designed to find relevant documents through matching indices of metadata and document content. The elements in the meta...
This paper describes revised content-based search experiments in the context of TRECVID 2003 benchmark. Experiments focus on measuring content-based video retrieval performance wi...