Sciweavers

» Marking Text Documents
FLAIRS
2001
13 years 9 months ago
Extracting Partial Structures from HTML Documents
A new wrapper model for extracting text data from HTML documents is introduced. Kushmerick's wrapper class (Kushmerick 2000) may be unsuccessful in the case that suff...
Hiroshi Sakamoto, Yoshitsugu Murakami, Hiroki Arim...
DRR
2011
12 years 7 months ago
Improved document image segmentation algorithm using multiresolution morphology
Page segmentation into text and non-text components is an essential preprocessing step before OCR operation. If this is not done properly, an OCR classification engine produces g...
Syed Saqib Bukhari, Faisal Shafait, Thomas M. Breu...
AIMSA
2008
Springer
14 years 2 months ago
Using Text Segmentation to Enhance the Cluster Hypothesis
An alternative way to tackle Information Retrieval, called Passage Retrieval, considers text fragments independently rather than assessing global relevance of documents. In such a ...
Sylvain Lamprier, Tassadit Amghar, Bernard Levrat,...
PAKM
1998
13 years 9 months ago
Knowledge Management: A Text Mining Approach
Knowledge Discovery in Databases (KDD), also known as data mining, focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns wi...
Ronen Feldman, Moshe Fresko, Haym Hirsh, Yonatan A...
EMNLP
2007
13 years 9 months ago
Topic Segmentation with Hybrid Document Indexing
We present a domain-independent unsupervised topic segmentation approach based on hybrid document indexing. Lexical chains have been successfully employed to evaluate lexical cohe...
Irina Matveeva, Gina-Anne Levow