
SIGIR
2008
ACM

Deep classification in large-scale text hierarchies

Most classification algorithms work best when categorizing Web documents into a small number of categories, such as the top two levels of the Open Directory Project. Such classification does not give the user detailed topic-related class information, because the first two levels are often too coarse. However, classification on a large-scale hierarchy is known to be intractable when there are many target categories with cross-link relationships among them. In this paper, we propose a novel deep-classification approach that categorizes Web documents into categories in a large-scale taxonomy. The approach consists of two stages: a search stage and a classification stage. In the first stage, a category-search algorithm acquires the category candidates for a given document. Based on these candidates, we prune the large-scale hierarchy to focus our classification effort on a small subset of the original hierarchy. As a result, the classification model is trained on the small subset...
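The two-stage flow described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the search or classification algorithms, so everything concrete here is an assumption made for illustration: the toy `TAXONOMY`, the term-overlap scoring in `search_stage`, and the nearest-centroid cosine classifier in `classification_stage` are stand-ins, not the authors' method. The sketch only shows the overall shape: search for candidate categories, prune the hierarchy to those candidates, then classify within the pruned subset.

```python
import math
from collections import Counter

# Hypothetical toy taxonomy: deep category path -> a few training documents.
TAXONOMY = {
    "Computers/Programming/Python": [
        "python list comprehension tutorial",
        "python decorators generators guide",
    ],
    "Computers/Programming/Java": [
        "java virtual machine garbage collection",
        "java generics interfaces tutorial",
    ],
    "Science/Biology/Genetics": [
        "dna gene sequencing genome",
        "gene expression heredity dna",
    ],
    "Sports/Soccer/Leagues": [
        "soccer league match goals",
        "football club league season",
    ],
}


def tokens(text):
    """Lowercase whitespace tokenizer (an assumption; any tokenizer would do)."""
    return text.lower().split()


def search_stage(doc, taxonomy, k=2):
    """Stage 1: return the k candidate categories sharing the most terms with doc.

    Term overlap is an assumed scoring function, used only for illustration.
    """
    doc_terms = set(tokens(doc))
    scores = {}
    for category, docs in taxonomy.items():
        category_terms = {t for d in docs for t in tokens(d)}
        scores[category] = len(doc_terms & category_terms)
    # Sort by descending score; break ties by category name for determinism.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:k]


def prune(taxonomy, candidates):
    """Keep only the candidate categories from the full hierarchy."""
    return {c: taxonomy[c] for c in candidates}


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classification_stage(doc, pruned):
    """Stage 2: build centroids from the pruned subset only, pick the closest."""
    doc_vec = Counter(tokens(doc))
    centroids = {
        c: Counter(t for d in docs for t in tokens(d))
        for c, docs in pruned.items()
    }
    return max(sorted(centroids), key=lambda c: cosine(doc_vec, centroids[c]))


def deep_classify(doc, taxonomy, k=2):
    """Search for candidates, prune the hierarchy, classify within the subset."""
    candidates = search_stage(doc, taxonomy, k)
    return classification_stage(doc, prune(taxonomy, candidates))


if __name__ == "__main__":
    print(deep_classify("python generators tutorial", TAXONOMY))
```

The point of the design is that the stage-two model only ever sees the `k` candidate categories, so its cost depends on `k` rather than on the size of the full taxonomy.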
Gui-Rong Xue, Dikan Xing, Qiang Yang, Yong Yu
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008
Where SIGIR