Sciweavers

» Effectiveness of Rich Document Representation in XML Retriev...
SIGMOD
2010
ACM
13 years 7 months ago
Expressive and flexible access to web-extracted data: a keyword-based structured query language
Automated extraction of structured data from Web sources often leads to large heterogeneous knowledge bases (KB), with data and schema items numbering in the hundreds of thousands...
Jeffrey Pound, Ihab F. Ilyas, Grant E. Weddell
CIKM
2007
Springer
14 years 1 month ago
Regularized locality preserving indexing via spectral regression
We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Different from...
Deng Cai, Xiaofei He, Wei Vivian Zhang, Jiawei Han
IJCAI
2007
13 years 9 months ago
Web Page Clustering Using Heuristic Search in the Web Graph
Effective representation of Web search results remains an open problem in the Information Retrieval community. For ambiguous queries, a traditional approach is to organize search ...
Ron Bekkerman, Shlomo Zilberstein, James Allan
PVLDB
2008
13 years 7 months ago
Industry-scale duplicate detection
Duplicate detection is the process of identifying multiple representations of a same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
DOCENG
2006
ACM
14 years 1 month ago
Content based SMS spam filtering
In the recent years, we have witnessed a dramatic increment in the volume of spam email. Other related forms of spam are increasingly revealing as a problem of importance, special...
José María Gómez Hidalgo, Gui...