Sciweavers

» Keyword-based document clustering
WSDM
2010
ACM
14 years 2 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
SIGIR
2009
ACM
14 years 2 months ago
Compressing term positions in web indexes
Large search engines process thousands of queries per second on billions of pages, making query processing a major factor in their operating costs. This has led to a lot of resear...
Hao Yan, Shuai Ding, Torsten Suel
ICSE
2009
IEEE-ACM
14 years 8 days ago
A-SCORE: Automatic software component recommendation using coding context
Reusing software components (e.g. classes or modules) improves software quality and developer’s productivity. Unfortunately, developers may miss many reusing opportunities since...
Ryuji Shimada, Yasuhiro Hayase, Makoto Ichii, Mako...
COMAD
2009
13 years 8 months ago
Querying for relations from the semi-structured Web
We present a class of web queries whose result is a multi-column relation instead of a collection of unstructured documents as in standard web search. The user specifies the query...
Sunita Sarawagi
CIKM
2007
Springer
14 years 1 months ago
Regularized locality preserving indexing via spectral regression
We consider the problem of document indexing and representation. Recently, Locality Preserving Indexing (LPI) was proposed for learning a compact document subspace. Different from...
Deng Cai, Xiaofei He, Wei Vivian Zhang, Jiawei Han