Sciweavers

ANLP
2000
Compound Noun Segmentation Based on Lexical Data Extracted from Corpus
Compound noun analysis is one of the crucial problems in Korean language processing because a series of nouns in Korean may appear without white space in real texts, which makes i...
Juntae Yoon
ICONIP
2009
Text Mining with an Augmented Version of the Bisecting K-Means Algorithm
There is an ever increasing number of electronic documents available today and the task of organizing and categorizing this ever growing corpus of electronic documents has become t...
Yutaro Hatagami, Toshihiko Matsuka
ICDE
2012
IEEE
Optimizing Statistical Information Extraction Programs over Evolving Text
Statistical information extraction (IE) programs are increasingly used to build real-world IE systems such as Alibaba, CiteSeer, Kylin, and YAGO. Current statistical IE approach...
Fei Chen, Xixuan Feng, Christopher Ré, Min Wang
ECAI
2006
Springer
Text Sampling and Re-Sampling for Imbalanced Authorship Identification Cases
Authorship identification can be seen as a single-label multi-class text categorization problem. Very often, there are extremely few training texts at least for some of the candida...
Efstathios Stamatatos
KDD
2006
ACM
Extracting key-substring-group features for text classification
In many text classification applications, it is appealing to take every document as a string of characters rather than a bag of words. Previous research studies in this area mostl...
Dell Zhang, Wee Sun Lee