Extracting information from web pages is an important problem; it has several applications such as providing improved search results and construction of databases to serve user qu...
Paramveer S. Dhillon, Sundararajan Sellamanickam, ...
Cross Language Information Retrieval community has brought up search engines over multilingual corpora, and multilingual text categorization systems. In this paper, we focus on th...
—For a large and evolving software system, the project team could receive a large number of bug reports. Locating the source code files that need to be changed in order to fix th...
A bitext, or bilingual parallel corpus, consists of two texts, each one in a different language, that are mutual translations. Bitexts are very useful in linguistic engineering bec...
Abstract. Efficient nearest neighbor (NN) search techniques for highdimensional data are crucial to content-based image retrieval (CBIR). Traditional data structures (e.g., kd-tree...
Pengcheng Wu, Steven C. H. Hoi, Duc Dung Nguyen, Y...