Sciweavers

332 search results
CIKM
2009
Springer
14 years 2 months ago
Automatic retrieval of similar content using search engine query interface
We consider the coverage testing problem where we are given a document and a corpus with a limited query interface and asked to find if the corpus contains a near-duplicate of th...
Ali Dasdan, Paolo D'Alberto, Santanu Kolay, Chris ...
JCDL
2003
ACM
14 years 22 days ago
Automatic Document Metadata Extraction Using Support Vector Machines
Automatic metadata generation provides scalability and usability for digital libraries and their collections. Machine learning methods offer robust and adaptable automatic metadat...
Hui Han, C. Lee Giles, Eren Manavoglu, Hongyuan Zh...
JCDL
2005
ACM
14 years 1 months ago
Automatic extraction of titles from general documents using machine learning
In this paper, we propose a machine learning approach to title extraction from general documents. By general documents, we mean documents that can belong to any one of a number of...
Yunhua Hu, Hang Li, Yunbo Cao, Dmitriy Meyerzon, Q...
DEXAW
2010
IEEE
13 years 8 months ago
Towards a Search System for the Web Exploiting Spatial Data of a Web Document
In this paper, we describe our work in progress in the scope of information retrieval exploiting the spatial data extracted from web documents. We discuss problems of a search for ...
Stefan Dlugolinsky, Michal Laclavik, Ladislav Hluc...
ECIR
2007
Springer
13 years 9 months ago
Feature- and Query-Based Table of Contents Generation for XML Documents
The availability of a document’s logical structure in XML retrieval allows retrieval systems to return document portions (elements) instead of whole documents. This helps searche...
Zoltán Szlávik, Anastasios Tombros, ...