INCDM
2010
Springer

Web-Site Boundary Detection

Defining the boundaries of a web-site, for (say) archiving or information retrieval purposes, is an important but complicated task. In this paper a web-page clustering approach to boundary detection is suggested. The principal issue is feature selection, hampered by the observation that there is no clear understanding of what a web-site is. This paper proposes a definition of a web-site, founded on the principle of user intention, directed at the boundary detection problem; and then reports on a sequence of experiments, using a number of clustering techniques, and a wide range of features and combinations of features to identify web-site boundaries. The preliminary results reported seem to indicate that, in general, a combination of features produces the most appropriate result.

Ayesh Alshukri, Frans Coenen, Michele Zito

Boundary Detection | Boundary Detection Problem | Data Mining | INCDM 2010 | Web-page Clustering Approach |

Post Info

Added	12 Oct 2010
Updated	12 Oct 2010
Type	Conference
Year	2010
Where	INCDM
Authors	Ayesh Alshukri, Frans Coenen, Michele Zito
