Sciweavers

328 search results - page 36 / 66
» A Multi-level Approach for Document Clustering
CIKM
2011
Springer
12 years 8 months ago
Probabilistic near-duplicate detection using simhash
This paper offers a novel look at using a dimensionality-reduction technique called simhash [8] to detect similar document pairs in large-scale collections. We show that this algo...
Sadhan Sood, Dmitri Loguinov
ESWA
2008
13 years 8 months ago
Visualization of patent analysis for emerging technology
Many methods have been developed to recognize those progresses of technologies, and one of them is to analyze patent information. And visualization methods are considered to be pr...
Young Gil Kim, Jong Hwan Suh, Sang-Chan Park
IRAL
2003
ACM
14 years 2 months ago
A practical text summarizer by paragraph extraction for Thai
In this paper, we propose a practical approach for extracting the most relevant paragraphs from the original document to form a summary for Thai text. The idea of our approach is ...
Chuleerat Jaruskulchai, Canasai Kruengkrai
ISI
2007
Springer
14 years 2 months ago
DOTS: Detection of Off-Topic Search via Result Clustering
Often document dissemination is limited to a “need to know” basis so as to better maintain organizational trade secrets. Retrieving documents that are off-topic to a user...
Nazli Goharian, Alana Platt
WWW
2005
ACM
14 years 2 months ago
X-warehouse: building query pattern-driven data
In this paper, we propose an approach to materialize XML data warehouses based on the frequent query patterns discovered from historical queries issued by users. The schemas of in...
Ji Zhang, Wei Wang, Han Liu, Sheng Zhang