Sciweavers

» Identifying Content Blocks from Web Documents
IEAAIE
2003
Springer
14 years 1 month ago
Applying Semantic Links for Classifying Web Pages
Automatic hypertext classification is an essential technique for organizing vast amounts of Internet Web pages or HTML documents. One of the problems in classifying Web pages is tha...
Ben Choi, Qing Guo
IJDAR
2002
13 years 8 months ago
Document understanding for a broad class of documents
We present a document analysis system able to assign logical labels and extract the reading order in a broad set of documents. All information sources, from geometric features and ...
Marco Aiello, Christof Monz, Leon Todoran
ICTAI
2009
IEEE
14 years 3 months ago
Classifying Sentence-Based Summaries of Web Documents
Text classification categorizes Web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consumi...
Maria Soledad Pera, Yiu-Kai Ng
IUI
2004
ACM
14 years 2 months ago
Identifying adaptation dimensions in digital talking books
We have developed an automatic DTB production platform [3], which is capable of flexibly generating different user interfaces for talking books. DTBs are built from digital copies ...
Carlos Duarte, Luís Carriço
ICDE
2004
IEEE
14 years 10 months ago
Probe, Cluster, and Discover: Focused Extraction of QA-Pagelets from the Deep Web
In this paper, we introduce the concept of a QA-Pagelet to refer to the content region in a dynamic page that contains query matches. We present THOR, a scalable and efficient min...
James Caverlee, Ling Liu, David Buttler