Sciweavers

33 search results - page 2 / 7
» Using tree-grammars for training set expansion in page class...
AWIC
2003
Springer
14 years 24 days ago
Web Page Classification: A Soft Computing Approach
The Internet makes it possible to share and manipulate a vast quantity of information efficiently and effectively, but the rapid and chaotic growth experienced by the Net has gener...
Angela Ribeiro, Víctor Fresno, Maria C. Gar...
AIPRF
2007
13 years 9 months ago
Evaluation of Different Approaches to Training a Genre Classifier
This paper presents experiments on classifying web pages by genre. Firstly, a corpus of 1539 manually labeled web pages was prepared. Secondly, 502 genre features were selected ba...
Vedrana Vidulin, Mitja Lustrek, Matjaz Gams
HICSS
2009
IEEE
14 years 2 months ago
An N-Gram Based Approach to Automatically Identifying Web Page Genre
The research reported in this paper is the first phase of a larger project on the automatic classification of web pages by their genres, using n-gram representations of the web pag...
Jane E. Mason, Michael A. Shepherd, Jack Duffy
COLING
2010
13 years 2 months ago
Discriminative Training for Near-Synonym Substitution
Near-synonyms are useful knowledge resources for many natural language applications such as query expansion for information retrieval (IR) and paraphrasing for text generation. Ho...
Liang-Chih Yu, Hsiu-Min Shih, Yu-Ling Lai, Jui-Fen...
WWW
2009
ACM
14 years 8 months ago
Purely URL-based topic classification
Given only the URL of a web page, can we identify its topic? This is the question that we examine in this paper. Usually, web pages are classified using their content [7], but a U...
Eda Baykan, Monika Rauch Henzinger, Ludmila Marian...