Sciweavers

» A Method for Calculating Term Similarity on Large Document C...
CIKM
2011
Springer
12 years 8 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
WIDM
2004
ACM
14 years 2 months ago
User evaluation of the NASA technical report server recommendation service
We present the user evaluation of two recommendation server methodologies implemented for the NASA Technical Report Server (NTRS). One methodology for generating recommendations u...
Michael L. Nelson, Johan Bollen, JoAnne R. Calhoun...
CIKM
2004
Springer
14 years 2 months ago
Hierarchical document categorization with support vector machines
Automatically categorizing documents into pre-defined topic hierarchies or taxonomies is a crucial step in knowledge and content management. Standard machine learning techniques ...
Lijuan Cai, Thomas Hofmann
CIKM
2008
Springer
13 years 10 months ago
Answering general time sensitive queries
Time is an important dimension of relevance for a large number of searches, such as over blogs and news archives. So far, research on searching over such collections has largely f...
Wisam Dakka, Luis Gravano, Panagiotis G. Ipeirotis
CIKM
2008
Springer
13 years 10 months ago
Experiments with English-Persian text retrieval
As the number of non-English documents is increasing dramatically on the web nowadays, the study and design of information retrieval systems for these languages is very important....
Abolfazl AleAhmad, Hadi Amiri, Masoud Rahgozar, Fa...