Sciweavers

» Unexpected results in automatic list extraction on the web
PVLDB
2008
13 years 7 months ago
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
BMCBI
2006
13 years 7 months ago
Automatic document classification of biological literature
Background: Document classification is a wide-spread problem with many applications, from organizing search engine snippets to spam filtering. We previously described Textpresso, ...
David Chen, Hans-Michael Müller, Paul W. Ster...
DOCENG
2009
ACM
14 years 2 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan
CIKM
2006
Springer
13 years 11 months ago
Movie review mining and summarization
With the flourish of the Web, online review is becoming a more and more useful and important information resource for people. As a result, automatic review mining and summarizing ...
Li Zhuang, Feng Jing, Xiaoyan Zhu
CIKM
2003
Springer
14 years 28 days ago
Categorizing web queries according to geographical locality
Web pages (and resources, in general) can be characterized according to their geographical locality. For example, a web page with general information about wildflowers could be c...
Luis Gravano, Vasileios Hatzivassiloglou, Richard ...