Sciweavers

» Discovering informative content blocks from Web documents
SIGDOC
2004
ACM
14 years 27 days ago
Semantic thumbnails: a novel method for summarizing document collections
The concept of thumbnails is common in image representation. A thumbnail is a highly compressed version of an image that provides a small, yet complete visual representation to th...
Arijit Sengupta, Mehmet M. Dalkilic, James C. Cost...
WIDM
2004
ACM
14 years 27 days ago
Stylistic and lexical co-training for web block classification
Many applications which use web data extract information from a limited number of regions on a web page. As such, web page division into blocks and the subsequent block classifica...
Chee How Lee, Min-Yen Kan, Sandra Lai
ISCIS
2003
Springer
14 years 20 days ago
A Cooperative Paradigm for Fighting Information Overload
The Web is mainly processed by humans. The role of the machines is just to transmit and display the contents of the documents, barely being able to do anything else. Nowadays ther...
Daniel Gayo-Avello, Darío Álvarez Gu...
RIAO
2007
13 years 9 months ago
From Layout to Semantic: a Reranking Model for Mapping Web Documents to Mediated XML Representations
Many documents on the Web are formatted in a weakly structured format. Because of their weak semantics and because of the heterogeneity of their formats, the information conveyed by...
Guillaume Wisniewski, Patrick Gallinari
APWEB
2003
Springer
14 years 22 days ago
Extracting Content Structure for Web Pages Based on Visual Representation
A new web content structure based on visual representation is proposed in this paper. Many web applications such as information retrieval, information extraction and auto...
Deng Cai, Shipeng Yu, Ji-Rong Wen, Wei-Ying Ma