Sciweavers

119 search results - page 5 / 24
» Learning to Extract Text-Based Information from the World Wi...
MAICS
2004
13 years 10 months ago
Intelligent Content Based Title and Author Name Extraction from Formatted Documents
This paper describes the development of algorithms for extracting the title and the names of the authors from documents available on the World Wide Web. In this paper we describe ...
Eric G. Berkowitz, Mohamed Reda Elkhadiri, Tim Sah...
JCDL
2004
ACM
14 years 2 months ago
Finding authoritative people from the web
Today’s web is so huge and diverse that it arguably reflects the real world. For this reason, searching the web is a promising approach to find things in the real world. This ...
Masanori Harada, Shin-ya Sato, Kazuhiro Kazama
WWW
2009
ACM
14 years 9 months ago
Incorporating site-level knowledge to extract structured data from web forums
Web forums have become an important data resource for many web applications, but extracting structured data from unstructured web forum pages is still a challenging task due to bo...
Jiang-Ming Yang, Rui Cai, Yida Wang, Jun Zhu, Lei ...
CHI
1996
ACM
14 years 24 days ago
Silk from a Sow's Ear: Extracting Usable Structures from the Web
In its current implementation, the World-Wide Web lacks much of the explicit structure and strong typing found in many closed hypertext systems. While this property has directly f...
Peter Pirolli, James E. Pitkow, Ramana Rao
DILS
2009
Springer
14 years 3 months ago
Site-Wide Wrapper Induction for Life Science Deep Web Databases
We present a novel approach to automatic information extraction from Deep Web Life Science databases using wrapper induction. Traditional wrapper induction techniques focus on lear...
Saqib Mir, Steffen Staab, Isabel Rojas