NDT
2010

Web Document Classification by Keywords Using Random Forests

Download public.clunet.edu

A web directory hierarchy is critical for serving users' search requests. Creating and maintaining such directories without the involvement of human experts requires good classification of web documents. In this paper, we explore web page classification using keywords from documents as attributes and the random forest learning method. Our initial results are promising: the random forest learning method performed better than several other well-known learning methods. When the number of topics increased from five to seven, random forests still performed better than the other methods, even though absolute classification rates decreased.
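
The general idea described in the abstract, keywords as attributes fed to a random forest, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the binary keyword-presence encoding, the two toy topics, the training sentences, and all function names (`keyword_features`, `build_tree`, `train_forest`, `predict`) are assumptions made here for illustration only.

```python
import random
from collections import Counter

def keyword_features(text, vocabulary):
    """Encode a document as 0/1 flags, one per keyword (assumed encoding)."""
    words = set(text.lower().split())
    return [1 if kw in words else 0 for kw in vocabulary]

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def build_tree(rows, labels, n_candidates, rng, depth=0, max_depth=8):
    """Grow one tree; a leaf is a label string, a node is (feature, left, right)."""
    if len(set(labels)) == 1 or depth >= max_depth:
        return majority(labels)
    best = None
    # Only a random subset of keywords is considered at each split.
    for f in rng.sample(range(len(rows[0])), n_candidates):
        left = [y for x, y in zip(rows, labels) if x[f] == 0]
        right = [y for x, y in zip(rows, labels) if x[f] == 1]
        if not left or not right:
            continue  # this keyword separates nothing at this node
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f)
    if best is None:
        return majority(labels)
    f = best[1]
    absent = [(x, y) for x, y in zip(rows, labels) if x[f] == 0]
    present = [(x, y) for x, y in zip(rows, labels) if x[f] == 1]
    return (
        f,
        build_tree([x for x, _ in absent], [y for _, y in absent],
                   n_candidates, rng, depth + 1, max_depth),
        build_tree([x for x, _ in present], [y for _, y in present],
                   n_candidates, rng, depth + 1, max_depth),
    )

def predict_tree(tree, x):
    while isinstance(tree, tuple):
        f, left, right = tree
        tree = right if x[f] else left
    return tree

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample of the documents."""
    rng = random.Random(seed)
    n_candidates = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], n_candidates, rng))
    return forest

def predict(forest, x):
    """Majority vote over all trees."""
    return majority([predict_tree(tree, x) for tree in forest])

# Toy data, invented for this sketch (not from the paper).
vocabulary = ["goal", "match", "team", "stock", "market", "bank"]
training_docs = [
    ("the team scored a late goal to win the match", "sports"),
    ("a goal in the final match lifted the team", "sports"),
    ("the match ended after the team conceded a goal", "sports"),
    ("every team wants a goal early in a match", "sports"),
    ("the bank warned that the stock market may fall", "finance"),
    ("stock prices rose as the market cheered the bank", "finance"),
    ("a bank report moved the stock market today", "finance"),
    ("the market rallied after bank stock gains", "finance"),
]
rows = [keyword_features(text, vocabulary) for text, _ in training_docs]
labels = [topic for _, topic in training_docs]
forest = train_forest(rows, labels)

new_page = "which team will score the first goal in the match"
print(predict(forest, keyword_features(new_page, vocabulary)))
```

A real experiment along these lines would need a much larger keyword vocabulary, more topics (the abstract mentions five to seven), and a held-out test set to measure classification rates; the sketch only shows how keyword attributes, bootstrap sampling, random feature selection, and majority voting fit together.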

Myungsook Klassen, Nikhila Paturi

Computer Networks | Forest Learning Methods | NDT 2010 | Random Forests | Web Directory Hierarchy |

Related Content

» Building pathway clusters from Random Forests classification using class votes

» Disambiguating authors in academic publications using random forests

» Simple Classification into Large Topic Ontology of Web Documents

» A comparison of machine learning techniques for phishing detection

» Exploring social annotations for web document classification

» Emotion Classification Using Web Blog Corpora

» Extraction and search of chemical formulae in text documents on the web

» Using web structure for classifying and describing web pages

» BayesTHMCRDR Algorithm for Automatic Classification of Web Document

Post Info

Added	29 Jan 2011
Updated	29 Jan 2011
Type	Journal
Year	2010
Where	NDT
Authors	Myungsook Klassen, Nikhila Paturi
