Sciweavers

» Minimal document set retrieval
CIKM
2010
Springer
13 years 5 months ago
Combining link and content for collective active learning
In this paper, we study a novel problem, Collective Active Learning, in which we aim to select a batch set of "informative" instances from a networking data set to query ...
Lixin Shi, Yuhang Zhao, Jie Tang
CIKM
2010
Springer
13 years 6 months ago
CiteData: a new multi-faceted dataset for evaluating personalized search performance
Personalized search systems have evolved to utilize heterogeneous features including document hyperlinks, category labels in various taxonomies and social tags in addition to free...
Abhay Harpale, Yiming Yang, Siddharth Gopal, Daqin...
KDD
2008
ACM
14 years 8 months ago
Scaling up text classification for large file systems
We combine the speed and scalability of information retrieval with the generally superior classification accuracy offered by machine learning, yielding a two-phase text classifie...
George Forman, Shyamsundar Rajaram
WWW
2010
ACM
14 years 2 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
CIKM
2009
Springer
14 years 2 months ago
Identifying comparable entities on the web
Web search engines are often presented with user queries that involve comparisons of real-world entities. Thus far, this interaction has typically been captured by users submittin...
Alpa Jain, Patrick Pantel