Sciweavers

» Extracting Partial Structures from HTML Documents
IJCAI
2003
13 years 9 months ago
Expressive Power of Tree and String Based Wrappers
There exist two types of wrappers: the string-based wrapper, such as the LR wrapper, and the tree-based wrapper. A tree-based wrapper designates extraction regions by nodes on the ...
Daisuke Ikeda, Yasuhiro Yamada, Sachio Hirokawa
DIAL
2004
IEEE
14 years 5 days ago
Xed: A New Tool for eXtracting Hidden Structures from Electronic Documents
PDF has become a very common format for exchanging printable documents. Further, it can be easily generated from the major document formats, which makes a huge number of PDF documents...
Karim Hadjar, Maurizio Rigamonti, Denis Lalanne, R...
ICDCSW
2003
IEEE
14 years 1 months ago
CATP: A Context-Aware Transportation Protocol for HTTP
The rendering mechanism used in Web browsers has a significant impact on user behavior and the delay tolerance of retrieval. The head-of-line blocking phenomenon prevents the ...
Huamin Chen, Prasant Mohapatra
MKM
2004
Springer
14 years 1 months ago
Extraction of Logical Structure from Articles in Mathematics
We propose a mathematical knowledge browser which helps people to read mathematical documents. With the browser, printed mathematical documents can be scanned and recognized by OCR (O...
Koji Nakagawa, Akihiro Nomura, Masakazu Suzuki
DEXAW
2004
IEEE
14 years 5 days ago
Data Extraction from Web Data Sources
This paper provides an explanation of the basic data structures used in a new page analysis technique to create wrappers (data extractors) for the result pages produced by web sit...
Jerome Robinson