Sciweavers

967 search results - page 173 / 194
» Text Mining
Sort
View
SIGIR
2010
ACM
13 years 3 months ago
Efficient partial-duplicate detection based on sequence matching
With the ever-increasing growth of the Internet, numerous copies of documents become serious problem for search engine, opinion mining and many other web applications. Since parti...
Qi Zhang, Yue Zhang, Haomin Yu, Xuanjing Huang
SIGMOD
2008
ACM
157views Database» more  SIGMOD 2008»
14 years 8 months ago
CRD: fast co-clustering on large datasets utilizing sampling-based matrix decomposition
The problem of simultaneously clustering columns and rows (coclustering) arises in important applications, such as text data mining, microarray analysis, and recommendation system...
Feng Pan, Xiang Zhang, Wei Wang 0010
JCDL
2004
ACM
198views Education» more  JCDL 2004»
14 years 1 months ago
Finding authoritative people from the web
Today’s web is so huge and diverse that it arguably reflects the real world. For this reason, searching the web is a promising approach to find things in the real world. This ...
Masanori Harada, Shin-ya Sato, Kazuhiro Kazama
KDD
2007
ACM
237views Data Mining» more  KDD 2007»
14 years 8 months ago
Knowledge discovery of multiple-topic document using parametric mixture model with dirichlet prior
Documents, such as those seen on Wikipedia and Folksonomy, have tended to be assigned with multiple topics as a meta-data. Therefore, it is more and more important to analyze a re...
Issei Sato, Hiroshi Nakagawa
KDD
2010
ACM
247views Data Mining» more  KDD 2010»
13 years 10 months ago
Active learning for biomedical citation screening
Active learning (AL) is an increasingly popular strategy for mitigating the amount of labeled data required to train classifiers, thereby reducing annotator effort. We describe ...
Byron C. Wallace, Kevin Small, Carla E. Brodley, T...