Sciweavers

» Content Extraction Signatures
IIWAS
2008
13 years 8 months ago
Combining content extraction heuristics: the CombinE system
The main text content of an HTML document on the WWW is typically surrounded by additional contents, such as navigation menus, advertisements, link lists or design elements. Conte...
Thomas Gottron
WWW
2010
ACM
14 years 2 months ago
CETR: content extraction via tag ratios
We present Content Extraction via Tag Ratios (CETR) – a method to extract content text from diverse webpages by using the HTML document’s tag ratios. We describe how to comput...
Tim Weninger, William H. Hsu, Jiawei Han
ICIP
2006
IEEE
14 years 9 months ago
Content Extraction and Summarization of Instructional Videos
This paper presents a robust approach to extracting and summarizing the textual content of instructional videos for handwritten recognition, indexing and retrieval, and other elea...
Tiecheng Liu, Chekuri Choudary
FQAS
2006
Springer
13 years 11 months ago
UNL as a Text Content Representation Language for Information Extraction
This paper describes a new approach for describing contents through the use of interlinguas in order to facilitate the extraction of specific pieces of information. The authors hig...
Jesús Cardeñosa, Carolina Gallardo, ...
AINA
2009
IEEE
14 years 2 months ago
Learning to Extract Content from News Webpages
We consider the problem of content extraction from online news webpages. To explore to what extent the syntactic markup and the visual structure of a webpage facilitate the extrac...
Alex Spengler, Patrick Gallinari