In many text retrieval tasks, it is highly desirable to obtain a "similarity profile" of the document collection for a given query. We propose sampling-based techniques ...
As of today, the amount of data on the Semantic Web has grown considerably. The services for searching and browsing entities on the Semantic Web are in demand. To provide such ser...
Modern Day Marco Polos, young influencers in emerging markets who have more access to technology than their peers, often act as the gateway to new websites and technology in their...
This work proposes a novel cautious surfer to incorporate trust into the process of calculating authority for web pages. We evaluate a total of sixty queries over two large, real-...
We describe an adaptive method for extracting records from web pages. Our algorithm combines a weighted tree matching metric with clustering for obtaining data extraction patterns...