High-quality structured data from structured Web sources is invaluable for many applications. Hidden Web databases are not directly crawlable by Web search engines and are on...
The success of Web 2.0 communities demonstrates that the users of an information service are willing to participate in the content creation process on a voluntary basis. In this...
This work focuses on scenarios that require the storage of large amounts of data. Such systems require the ability to either continuously increase the storage space or reclaim spa...
The Web is a valuable source of language-specific resources, but the process of collecting, organizing and utilizing these resources is difficult. We describe CorpusBuilder, an approa...
Abstract. This paper presents a simple unsupervised learning algorithm for recognizing synonyms, based on statistical data acquired by querying a Web search engine. The algorithm, ...