As the Web has evolved into a data-rich repository, with the standard “page view,” current search engines are becoming increasingly inadequate for a wide range of query tasks....
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
We consider the problem of deep web source selection and argue that existing source selection methods are inadequate as they are based on local similarity assessment. Specificall...
Machine Learned Ranking approaches have shown successes in web search engines. With the increasing demands on developing effective ranking functions for different search domains, ...
Keke Chen, Rongqing Lu, C. K. Wong, Gordon Sun, La...
The structures of space and time are identified as essential for the realization of cognitive systems. It is suggested that the omnipresence of space and time may have been respons...