Sciweavers

265 search results - page 17 / 53
» Scalable Text Retrieval for Large Digital Libraries
LREC
2010
Corpus and Evaluation Measures for Automatic Plagiarism Detection
The simple access to texts on digital libraries and the WWW has led to an increased number of plagiarism cases in recent years, which renders manual plagiarism detection infeasibl...
Alberto Barrón-Cedeño, Martin Pottha...
KDD
2007
ACM
Content-based document routing and index partitioning for scalable similarity-based searches in a large corpus
We present a document routing and index partitioning scheme for scalable similarity-based search of documents in a large corpus. We consider the case when similarity-based search ...
Deepavali Bhagwat, Kave Eshghi, Pankaj Mehra
BMCBI
2005
Ranking the whole MEDLINE database according to a large training set using text indexing
Background: The MEDLINE database contains over 12 million references to scientific literature, with about 3/4 of recent articles including an abstract of the publication. Retrieval of ent...
Brian P. Suomela, Miguel A. Andrade
DBISP2P
2008
Springer
Exploiting Distribution Skew for Scalable P2P Text Clustering
K-Means clustering is widely used in information retrieval and data mining. Distributed K-Means variants have already been proposed, but none of the past algorithms scales to large...
Odysseas Papapetrou, Wolf Siberski, Fabian Leitrit...
CLEF
2009
Springer
Clustering for Text and Image-Based Photo Retrieval at CLEF 2009
For this year's ImageCLEF Photo Retrieval task, we have prepared 5 submission runs to help us assess the effectiveness of 1) image content-based retrieval, and 2) text-based ...
Qian Zhu, Diana Inkpen