Sciweavers

» Scalable Feature Extraction from Noisy Documents
EMNLP
2007
13 years 11 months ago
Enhancing Single-Document Summarization by Combining RankNet and Third-Party Sources
We present a new approach to automatic summarization based on neural nets, called NetSum. We extract a set of features from each sentence that helps identify its importance in the...
Krysta Marie Svore, Lucy Vanderwende, Christopher ...
IPM
2007
13 years 9 months ago
Web page title extraction and its application
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
Yewei Xue, Yunhua Hu, Guomao Xin, Ruihua Song, Shu...
INFOSCALE
2007
ACM
13 years 11 months ago
Query-driven indexing for scalable peer-to-peer text retrieval
We present a query-driven algorithm for the distributed indexing of large document collections within structured P2P networks. To cope with bandwidth consumption that has been ide...
Gleb Skobeltsyn, Toan Luu, Ivana Podnar Zarko, Mar...
ICDAR
2007
IEEE
14 years 1 month ago
WEB Image Classification Based on the Fusion of Image and Text Classifiers
This paper presents a novel method for the classification of images that combines information extracted from the images and contextual information. The main hypothesis is that con...
Pedro R. Kalva, Fabrício Enembreck, Alessan...
ICADL
2004
Springer
14 years 3 months ago
Character Region Identification from Cover Images Using DTT
A robust character region identification approach is proposed here to deal with cover images using a differential top-hat transformation (DTT). The DTT is derived from morphologica...
Lixu Gu