Sciweavers

» Extracting semantic structure of web documents using content...
WWW
2010
ACM
14 years 3 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
ECIR
2009
Springer
13 years 6 months ago
Refining Keyword Queries for XML Retrieval by Combining Content and Structure
The structural heterogeneity and complexity of XML repositories makes query formulation challenging for users who have little knowledge of XML. To assist its users, an XM...
Desislava Petkova, W. Bruce Croft, Yanlei Diao
WWW
2004
ACM
14 years 9 months ago
Fine-grained, structured configuration management for web projects
Researchers in Web engineering have regularly noted that existing Web application development environments provide little support for managing the evolution of Web applications. K...
Tien Nhut Nguyen, Ethan V. Munson, Cheng Thao
WWW
2009
ACM
14 years 9 months ago
Extracting article text from the web with maximum subsequence segmentation
Much of the information on the Web is found in articles from online news outlets, magazines, encyclopedias, review collections, and other sources. However, extracting this content...
Jeff Pasternack, Dan Roth
ASWC
2006
Springer
14 years 12 days ago
Web Services Analysis: Making Use of Web Service Composition and Annotation
Automated Web service composition and automated Web service annotation could be seen as complementary methodologies. While automated annotation allows to extract Web service semant...
Peep Küngas, Mihhail Matskin