Client interactions with modern web-accessible network services are typically organized into sessions involving multiple requests that read and write shared application data. Ther...
Machine-learned ranking techniques automatically learn a complex document ranking function from training data. These techniques have demonstrated the effectiveness and flexibilit...
Jing Bai, Fernando Diaz, Yi Chang, Zhaohui Zheng, ...
This paper presents a new web mining scheme for parallel data acquisition. Based on the Document Object Model (DOM), a web page is represented as a DOM tree. Then a DOM tree align...
In this paper, we present a fast and scalable Bayesian model for improving weakly annotated data, which is typically generated by a (semi-)automated information extraction (IE) ...
Query translation for Cross-Lingual Information Retrieval (CLIR) has attracted increasing research attention. Previous work mainly used machine translation systems, bilin...
Rong Hu, Weizhu Chen, Jian Hu, Yansheng Lu, Zheng ...