We introduce a new approach to analyzing click logs by examining both the documents that are clicked and those that are bypassed--documents returned higher in the ordering of the ...
Atish Das Sarma, Sreenivas Gollapudi, Samuel Ieong
Publicly available biological resources form a complex maze of highly heterogeneous data sources, interconnected by navigational capabilities and applications. Although it offers ...
Large collections of documents are commonly created around a database, where a typical database schema may contain hundreds of tables and thousands of columns. We developed a syst...
Carlos Garcia-Alvarado, Carlos Ordonez, Zhibo Chen...
Document similarity search (i.e. query by example) aims to retrieve a ranked list of documents similar to a query document in a text corpus or on the Web. Most existing approaches...
This paper1 presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...