We describe a novel, simple, and highly scalable semi-supervised method called Word-Class Distribution Learning (WCDL), and apply it to the task of information extraction (IE) by utili...
Yanjun Qi, Ronan Collobert, Pavel Kuksa, Koray Kav...
This paper presents Dynamic IPL B+-tree (d-IPL for short) as a B+-tree index variant for flash-based storage systems. The d-IPL B+-tree adopts a dynamic In-Page Logging (IPL) s...
We introduce a new theoretical derivation, evaluation methods, and extensive empirical analysis for an automatic query expansion framework in which model estimation is cast as a r...
Scientists often search for document-elements like tables, figures, or algorithm pseudo-codes. Domain scientists and researchers report important data, results and algorithms usi...
Structural indices play a significant role in improving the efficiency of XML query evaluation. Being able to compare various structural indexing techniques is critical for a DBM...
Yuqing Wu, Sofia Brenes, Tejas Totade, Shijin Josh...
Wikipedia is the largest monolithic repository of human knowledge. In addition to its sheer size, it represents a new encyclopedic paradigm by interconnecting articles through hyp...
Modern search engines have to be fast to satisfy users, so there are hard back-end latency requirements. The set of features useful for search ranking functions, though, continues...
Feng Pan, Tim Converse, David Ahn, Franco Salvetti...
Relevance Feedback has proven very effective for improving retrieval accuracy. A difficult yet important problem in all relevance feedback methods is how to optimally balance the...
We present a novel language-model-based approach to reranking an initially retrieved list so as to improve precision at top ranks. Our model integrates whole-document information ...