Sciweavers

112 search results for "Clustering Template Based Web Documents" (page 18 / 23)
WWW
2009
ACM
14 years 10 months ago
Extracting article text from the web with maximum subsequence segmentation
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
Jeff Pasternack, Dan Roth
SIGIR
2008
ACM
13 years 9 months ago
Towards breaking the quality curse: a web-querying approach to web people search
Searching for people on the Web is one of the most common query types to the web search engines today. However, when a person name is queried, the returned webpages often contain ...
Dmitri V. Kalashnikov, Rabia Nuray-Turan, Sharad M...
TREC
2003
13 years 11 months ago
UMBC at TREC 12
Abstract. We present the results of UMBC’s participation in the Web and Novelty tracks. We explored various heuristics-based link analysis approaches to the Topic Distillation ta...
Srikanth Kallurkar, Yongmei Shi, R. Scott Cost, Ch...
ICDE
2003
IEEE
14 years 11 months ago
CLUSEQ: Efficient and Effective Sequence Clustering
Analyzing sequence data has become increasingly important recently in the area of biological sequences, text documents, web access logs, etc. In this paper, we investigate the pro...
Jiong Yang, Wei Wang 0010
MM
2004
ACM
14 years 3 months ago
Cortina: a system for large-scale, content-based web image retrieval
Recent advances in processing and networking capabilities of computers have led to an accumulation of immense amounts of multimedia data such as images. One of the largest reposit...
Till Quack, Ullrich Mönich, Lars Thiele, B. S...