Sciweavers

» Extracting semantic structure of web documents using content...
DOCENG
2009
ACM
14 years 2 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan
WWW
2005
ACM
14 years 8 months ago
Thresher: automating the unwrapping of semantic content from the World Wide Web
We describe Thresher, a system that lets non-technical users teach their browsers how to extract semantic web content from HTML documents on the World Wide Web. Users specify exam...
Andrew Hogue, David R. Karger
KDD
2002
ACM
14 years 8 months ago
Discovering informative content blocks from Web documents
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
Shian-Hua Lin, Jan-Ming Ho
MSV
2007
13 years 9 months ago
Visualizing Knowledge Domain Citation and Semantic Structure
Researchers are faced with a wide range of tasks when interacting with the literature of a scientific field. These tasks range from determining the field’s seminal documents, f...
Richard H. Fowler, Kyle Picou, Wendy Fowler, Yavuz...
AVI
2000
13 years 8 months ago
A Modular Approach for Exploring the Semantic Structure of Technical Document Collections
The identification and analysis of an enterprise's knowledge available in a documented form is a key element of knowledge management. Visual methods which allow easy access t...
Andreas Becks, Stefan Sklorz, Matthias Jarke