Sciweavers

154 search results - page 29 / 31
WWW
2007
ACM
14 years 9 months ago
The discoverability of the web
Previous studies have highlighted the high arrival rate of new content on the web. We study the extent to which this new content can be efficiently discovered by a crawler. Our st...
Anirban Dasgupta, Arpita Ghosh, Ravi Kumar, Christ...
KDD
2004
ACM
136 views, Data Mining
14 years 9 months ago
A cross-collection mixture model for comparative text mining
In this paper, we define and study a novel text mining problem, which we refer to as Comparative Text Mining (CTM). Given a set of comparable text collections, the task of compara...
ChengXiang Zhai, Atulya Velivelli, Bei Yu
SIGMOD
2008
ACM
107 views, Database
14 years 9 months ago
Outlier-robust clustering using independent components
How can we efficiently find a clustering, i.e. a concise description of the cluster structure, of a given data set which contains an unknown number of clusters of different shape ...
Christian Böhm, Christos Faloutsos, Claudia P...
AFRIGRAPH
2007
ACM
14 years 23 days ago
Generic computation of bulletin boards into geometric kernels
Nowadays, many commercial CAD systems are built on proprietary geometric kernels which provide an API containing a set of high level geometric operations (boolean operations, slot...
Mehdi Baba-ali, David Marcheix, Xavier Skapin, Yve...
WWW
2003
ACM
14 years 9 months ago
Piazza: data management infrastructure for semantic web applications
The Semantic Web envisions a World Wide Web in which data is described with rich semantics and applications can pose complex queries. To this point, researchers have defined new l...
Alon Y. Halevy, Zachary G. Ives, Peter Mork, Igor ...