Automated text categorization is an important technique for many web applications, such as document indexing, document filtering, and cataloging web resources. Many different appr...
The Web offers rich relational data with different semantics. In this paper, we address the problem of document recommendation in a digital library, where the documents in questio...
Ding Zhou, Shenghuo Zhu, Kai Yu, Xiaodan Song, Bel...
We study in this paper the Web forum crawling problem, which is a very fundamental step in many Web applications, such as search engine and Web data mining. As a typical user-crea...
Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, Lei ...
We address the problem of identifying the domain of online databases. More precisely, given a set F of Web forms automatically gathered by a focused crawler and an online database...
We consider the problem of indexing high-dimensional data for answering (approximate) similarity-search queries. Similarity indexes prove to be important in a wide variety of sett...