Sciweavers

373 search results - page 13 / 75
» Correcting the Document Layout: A Machine Learning Approach
ISIWI
2000
13 years 10 months ago
Automatic Document Classification - A thorough Evaluation of various Methods
(Automatic) document classification is generally defined as content-based assignment of one or more predefined categories to documents. Usually, machine learning, statistical patt...
Christoph Goller, J. Löning, T. Will, W. Wolf...
FLAIRS
2007
13 years 11 months ago
Document Semantic Annotation for Intelligent Tutoring Systems: A Concept Mapping Approach
The difficulty of domain knowledge acquisition is one of the most sensible challenges of intelligent tutoring systems. Relying on domain experts and building domain models from sc...
Amal Zouaq, Roger Nkambou, Claude Frasson
CIKM
2005
Springer
14 years 2 months ago
Learning to summarise XML documents using content and structure
Documents formatted in eXtensible Markup Language (XML) are becoming increasingly available in collections of various document types. In this paper, we present an approach for the...
Massih-Reza Amini, Anastasios Tombros, Nicolas Usu...
CVPR
2009
IEEE
14 years 8 days ago
Robust unsupervised segmentation of degraded document images with topic models
Segmentation of document images remains a challenging vision problem. Although document images have a structured layout, capturing enough of it for segmentation can be difficult....
Timothy J. Burns, Jason J. Corso
SIGIR
2005
ACM
14 years 2 months ago
Title extraction from bodies of HTML documents and its application to web page retrieval
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...