Sciweavers

2827 search results - page 12 / 566
» Marking Text Documents
COLING
2002
Unknown Word Extraction for Chinese Documents
There is no blank to mark word boundaries in Chinese text. As a result, identifying words is difficult, because of segmentation ambiguities and occurrences of unknown words. Conve...
Keh-Jiann Chen, Wei-Yun Ma
KDD
2005
ACM
On the use of linear programming for unsupervised text classification
We propose a new algorithm for dimensionality reduction and unsupervised text classification. We use mixture models as underlying process of generating corpus and utilize a novel,...
Mark Sandler
AUSDM
2007
Springer
Measuring Data-Driven Ontology Changes using Text Mining
Most current ontology management systems concentrate on detecting usage-driven changes and representing changes formally in order to maintain the consistency. In this paper, we pr...
Majigsuren Enkhsaikhan, Wilson Wong, Wei Liu, Mark...
ICDAR
2003
IEEE
Text - Image Separation in Devanagari Documents
In this paper we present a top-down, projection-profile based algorithm to separate text blocks from image blocks in a Devanagari document. We use a distinctive feature of Devana...
Swapnil Khedekar, Vemulapati Ramanaprasad, Srirang...
JCDL
2005
ACM
xTagger: a new approach to authoring document-centric XML
The process of authoring document-centric XML documents in humanities disciplines is very different from the approach espoused by the standard XML editing software with the data-c...
Ionut Emil Iacob, Alex Dekhtyar