Comparing and Combining Two Approaches to Automated Subject Classification of Text



A machine-learning approach and a string-matching approach to automated subject classification of text were compared with respect to their performance, advantages, and downsides. The former was based on an SVM algorithm, while the latter relied on string-matching between a controlled vocabulary and the words in the text to be classified. The data collection consisted of a subset of Compendex, classified into six different classes. SVM was shown to outperform the string-matching approach on average; our hypothesis that SVM yields better recall and string-matching better precision was confirmed for only one of the classes. Because the two approaches are complementary, we investigated different combinations of the two based on combining their vocabularies. The results show that the original approaches, i.e. the machine-learning approach without background knowledge from the controlled vocabulary and the string-matching approach based on the controlled vocabulary, outperform approaches in which combinatio...
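To make the string-matching side of the comparison concrete, here is a minimal Python sketch of classifying a text by matching controlled-vocabulary terms against its words, followed by the per-document precision and recall used to compare approaches. It is an illustration under stated assumptions, not the authors' implementation: the vocabulary entries, class labels, the `min_score` threshold, and the function names are all hypothetical, and the abstract does not describe the actual matching rules or term weights.

```python
import re
from collections import Counter

# Hypothetical controlled vocabulary mapping terms to class labels.
# These entries are illustrative and are not taken from the paper.
VOCABULARY = {
    "heat transfer": "thermodynamics",
    "entropy": "thermodynamics",
    "semiconductor": "electronics",
    "integrated circuit": "electronics",
}


def classify_by_string_matching(text, vocabulary=VOCABULARY, min_score=1):
    """Return the set of classes whose vocabulary terms occur in the text."""
    # Lowercase and collapse punctuation so multi-word terms match reliably.
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    scores = Counter()
    for term, label in vocabulary.items():
        # Count whole-word occurrences of the term in the normalized text.
        pattern = r"\b" + re.escape(term) + r"\b"
        scores[label] += len(re.findall(pattern, normalized))
    # Assumed cut-off: a class is assigned once it has min_score matches.
    return {label for label, score in scores.items() if score >= min_score}


def precision_recall(predicted, actual):
    """Compute precision and recall for one document's set of classes."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


predicted = classify_by_string_matching(
    "Entropy and heat transfer in a semiconductor device."
)
print(predicted)
print(precision_recall(predicted, {"thermodynamics"}))
```

Raising `min_score` in this sketch assigns fewer classes per document, which tends to trade recall for precision. That is the kind of trade-off the abstract's hypothesis about the two approaches concerns.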

Koraljka Golub, Anders Ardö, Dunja Mladenic, Marko Grobelnik


Controlled Vocabulary | Education | ERCIMDL 2006 | String-matching Approach | String-matching Better Precision |


» Automated Detection of Tumors in Mammograms Using Two Segments for Classification

» Text classification business intelligence and interactivity automating CSat analysis for s...

» Identifying comparative sentences in text documents

» Personal Sense and Idiolect: Combining Authorship Attribution and Opinion Analysis

» Text classification improved through multigram models

» A novel refinement approach for text categorization

» A class-feature-centroid classifier for text categorization

» Topic-bridged PLSA for cross-domain text classification

Post Info

Added	22 Aug 2010
Updated	22 Aug 2010
Type	Conference
Year	2006
Where	ERCIMDL
Authors	Koraljka Golub, Anders Ardö, Dunja Mladenic, Marko Grobelnik
