Sciweavers

» Identifying Content Blocks from Web Documents
ICMCS
2007
IEEE
Web Page Segmentation Based on Gestalt Theory
Automatic web page segmentation is the basis for adaptive web browsing on mobile devices. It breaks a large page into smaller blocks, in which contents with coherent semantics are ...
Peifeng Xiang, Xin Yang, Yuanchun Shi
WWW
2008
ACM
Sailer: An Effective Search Engine for Unified Retrieval of Heterogeneous XML and Web Documents
This paper studies the problem of unified ranked retrieval of heterogeneous XML documents and Web data. We propose an effective search engine called Sailer to adaptively and versa...
Guoliang Li, Jianhua Feng, Jianyong Wang, Xiaoming...
VLDB
2007
ACM
Inferring XML Schema Definitions from XML Data
Although the presence of a schema enables many optimizations for operations on XML documents, recent studies have shown that many XML documents in practice either do not refer to ...
Geert Jan Bex, Frank Neven, Stijn Vansummeren
WEBI
2005
Springer
Automated Metadata and Instance Extraction from News Web Sites
In this paper, we present automated techniques for extracting metadata instance information by organizing and mining a set of news Web sites. We develop algorithms that detect and...
Srinivas Vadrevu, Saravanakumar Nagarajan, Fatih G...
SETN
2010
Springer
Scalable Semantic Annotation of Text Using Lexical and Web Resources
In this paper we deal with the task of adding domain-specific semantic tags to a document, based solely on the domain ontology and generic lexical and Web resource...
Elias Zavitsanos, George Tsatsaronis, Iraklis Varl...