Sciweavers

» Marking Text Documents
COLING
2002
13 years 7 months ago
Unknown Word Extraction for Chinese Documents
There is no blank to mark word boundaries in Chinese text. As a result, identifying words is difficult, because of segmentation ambiguities and occurrences of unknown words. Conve...
Keh-Jiann Chen, Wei-Yun Ma
KDD
2005
ACM
14 years 8 months ago
On the use of linear programming for unsupervised text classification
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...
Mark Sandler
AUSDM
2007
Springer
14 years 2 months ago
Measuring Data-Driven Ontology Changes using Text Mining
Most current ontology management systems concentrate on detecting usage-driven changes and representing changes formally in order to maintain the consistency. In this paper, we pr...
Majigsuren Enkhsaikhan, Wilson Wong, Wei Liu, Mark...
ICDAR
2003
IEEE
14 years 1 months ago
Text - Image Separation in Devanagari Documents
In this paper we present a top-down, projection-profile based algorithm to separate text blocks from image blocks in a Devanagari document. We use a distinctive feature of Devana...
Swapnil Khedekar, Vemulapati Ramanaprasad, Srirang...
JCDL
2005
ACM
14 years 1 months ago
xTagger: a new approach to authoring document-centric XML
The process of authoring document-centric XML documents in humanities disciplines is very different from the approach espoused by the standard XML editing software with the data-c...
Ionut Emil Iacob, Alex Dekhtyar