Classification of Web Documents Using a Graph Model

This paper describes the classification of web documents using a graph-based model instead of the traditional vector-based model for document representation. We compare the classification accuracy of the vector-model approach using the k-Nearest Neighbor (k-NN) algorithm with that of a novel approach that allows graphs to be used for document representation in the k-NN algorithm. The proposed method is evaluated on three different web document collections, using the leave-one-out approach to measure classification accuracy. The results show that the graph-based k-NN approach can outperform traditional vector-based k-NN methods in both accuracy and execution time.
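As a rough illustration of the idea in the abstract, the sketch below runs k-NN over graph representations and scores it with leave-one-out. It is not the authors' implementation: the graph construction (unique words as nodes, adjacent-word pairs as directed edges), the distance (one minus the size of the shared subgraph over the size of the larger graph), and all function names are assumptions chosen for this example.

```python
from collections import Counter


def build_graph(text):
    """Hypothetical graph model: nodes are unique words, and a directed
    edge links each word to the word that immediately follows it."""
    words = text.lower().split()
    return set(words), set(zip(words, words[1:]))


def graph_size(graph):
    """Size of a graph, taken here as node count plus edge count (an assumption)."""
    nodes, edges = graph
    return len(nodes) + len(edges)


def graph_distance(g1, g2):
    """Assumed distance: 1 - |common subgraph| / max(|g1|, |g2|).
    Node labels are unique, so the common subgraph is simply the
    intersection of the node sets and of the edge sets."""
    largest = max(graph_size(g1), graph_size(g2))
    if largest == 0:
        return 0.0
    common = len(g1[0] & g2[0]) + len(g1[1] & g2[1])
    return 1.0 - common / largest


def knn_classify(query, training, k=3):
    """Majority vote among the k training graphs closest to the query."""
    ranked = sorted(training, key=lambda item: graph_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(dataset, k=3):
    """Classify each document using all the others as training data."""
    correct = 0
    for i, (graph, label) in enumerate(dataset):
        rest = dataset[:i] + dataset[i + 1:]
        if knn_classify(graph, rest, k) == label:
            correct += 1
    return correct / len(dataset)
```

A call such as `leave_one_out_accuracy([(build_graph(text), label) for text, label in docs], k=1)` returns the fraction of documents whose nearest neighbours vote for the correct class, where `docs` is a hypothetical list of (text, label) pairs.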

Adam Schenker, Mark Last, Horst Bunke, Abraham Kandel

Classification Accuracy | Document Analysis | Document Representation | ICDAR 2003 | Traditional Vector-based Model

» Improving web page classification by label-propagation over click graphs

» Phrase-based Document Similarity Based on an Index Graph Model

» Structure-Based Document Model with Discrete Wavelet Transforms and Its Application to Docu...

» Using a Layered Markov Model for Distributed Web Ranking Computation

» Image classification using the web graph

» Classification of model transformation techniques used in UML-based Web engineering

» Semi-supervised Document Classification with a Mislabeling Error Model

» Comments-oriented document summarization: understanding documents with readers' feedback

Post Info

Added	04 Jul 2010
Updated	04 Jul 2010
Type	Conference
Year	2003
Where	ICDAR
Authors	Adam Schenker, Mark Last, Horst Bunke, Abraham Kandel

Sciweavers
