In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
Due to the resource limitation in the data stream environment, it has been reported that answering user queries according to the wavelet synopsis of a stream is an essential abili...
The problem of identifying approximately duplicate objects in databases is an essential step for the information integration process. Most existing approaches have relied on gener...
Background: Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a u...
Previous research has shown that older adults performed worse in web search tasks, and attributed poorer performance to a decline in their cognitive abilities. We conducted a stud...
Jessie Chin, Wai-Tat Fu, Thomas George Kannampalli...