We consider the problem of efficiently computing weighted proximity best-joins over multiple lists, with applications in information retrieval and extraction. We are given a multi-...
AnHai Doan, Haixun Wang, Hao He, Jun Yang 0001, Ri...
We present a hybrid method to turn off-the-shelf information retrieval (IR) systems into future event predictors. Given a query, a time series model is trained on the publication...
Link analysis is a key technology in contemporary web search engines. Most of the previous work on link analysis only used information from one snapshot of web graph. Since commer...
Lei Yang, Lei Qi, Yan-Ping Zhao, Bin Gao, Tie-Yan ...
We analyse the kinematics of probabilistic term weights at retrieval time for di erent Information Retrieval models. We present four models based on di erent notions of probabilis...
The detection of repeated subsequences, time series motifs, is a problem which has been shown to have great utility for several higher-level data mining algorithms, including clas...