With hundreds of millions of participants, social media services have become commonplace. Unlike a traditional social network service, a microblogging network like Twitter is a hy...
As user demands become increasingly sophisticated, search engines today are competing in more than just returning document results from the Web. One area of competition is providi...
We introduce a generative probabilistic document model, based on latent Dirichlet allocation (LDA), that handles textual errors in a document collection. Our model is inspired by...
We consider data exchange for XML documents: given source and target schemas, a mapping between them, and a document conforming to the source schema, construct a target document a...
We study the problem of detecting coordinated free text campaigns in large-scale social media. These campaigns – ranging from coordinated spam messages to promotional and advert...
Kyumin Lee, James Caverlee, Zhiyuan Cheng, Daniel ...
Powerful SIMD instructions in modern processors offer an opportunity for greater search performance. In this paper, we apply these instructions to decoding search engine posting ...
Alexander A. Stepanov, Anil R. Gangolli, Daniel E....
Various semi-supervised learning methods have recently been proposed to address the long-standing shortage of manually labeled data in sentiment classification. However, mos...
In most cases, scientists rely on earlier literature relevant to their research fields to develop new ideas. However, it is neither wise nor possible to trac...
Rui Yan, Jie Tang, Xiaobing Liu, Dongdong Shan, Xi...
A continuous top-k query retrieves the k most preferred objects in a data stream according to a given preference function. These queries are important for a broad spectrum of appl...
Avani Shastri, Di Yang, Elke A. Rundensteiner, Mat...
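To make the definition of a continuous top-k query in the preceding abstract concrete, here is a minimal sketch. It is a naive baseline, not the method of the paper: it assumes a count-based sliding window over the stream (the abstract as listed does not say which window model is used) and simply recomputes the answer on every arrival. The names continuous_top_k, window_size, and score are hypothetical and chosen only for illustration.

```python
import heapq
from collections import deque


def continuous_top_k(stream, k, window_size, score):
    """Yield the k most preferred objects in the current window after each arrival.

    Assumptions (not taken from the abstract): a count-based sliding window
    of `window_size` objects, and a preference function `score` where a
    higher value means more preferred.
    """
    window = deque(maxlen=window_size)  # oldest object is dropped automatically
    for obj in stream:
        window.append(obj)
        # Naive recomputation: scan the whole window on every arrival.
        yield heapq.nlargest(k, window, key=score)


if __name__ == "__main__":
    readings = [5, 1, 9, 3, 7]
    for answer in continuous_top_k(readings, k=2, window_size=3, score=lambda x: x):
        print(answer)
```

Recomputing from scratch costs a full scan of the window per arrival, which is the overhead that specialized continuous top-k techniques generally try to avoid.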
Entity matching (EM) is the task of identifying records that refer to the same real-world entity from different data sources. While EM is widely used in data integration and data...
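As an illustration of the entity matching task defined in the preceding abstract, the following sketch pairs records from two sources whose token sets overlap strongly. It is a simple baseline under stated assumptions, not the approach of the paper: records are assumed to be flat dictionaries, similarity is plain Jaccard overlap of lowercased tokens, and the function names and the 0.5 threshold are hypothetical.

```python
def tokens(record):
    """Lowercased whitespace tokens drawn from all field values of a record."""
    text = " ".join(str(value) for value in record.values())
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_entities(source_a, source_b, threshold=0.5):
    """Return index pairs (i, j) where source_a[i] and source_b[j] look alike.

    Assumption: an all-pairs comparison is acceptable, which only holds
    for small inputs.
    """
    matches = []
    for i, rec_a in enumerate(source_a):
        tok_a = tokens(rec_a)
        for j, rec_b in enumerate(source_b):
            if jaccard(tok_a, tokens(rec_b)) >= threshold:
                matches.append((i, j))
    return matches


if __name__ == "__main__":
    shop_a = [{"name": "Apple iPhone 4 16GB"}, {"name": "Dell laptop"}]
    shop_b = [{"name": "iphone 4 apple 16gb black"}, {"name": "HP printer"}]
    print(match_entities(shop_a, shop_b))
```

The all-pairs loop grows with the product of the two source sizes, so a practical system would usually first restrict the candidate pairs it compares.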