Sciweavers

» Document zone content classification and its performance eva...
AUSAI
2001
Springer
13 years 11 months ago
Fast Text Classification Using Sequential Sampling Processes
A central problem in information retrieval is the automated classification of text documents. While many existing methods achieve good levels of performance, they generally require...
Michael D. Lee
SIGIR
2008
ACM
13 years 7 months ago
Classifiers without borders: incorporating fielded text from neighboring web pages
Accurate web page classification often depends crucially on information gained from neighboring pages in the local web graph. Prior work has exploited the class labels of nearby p...
Xiaoguang Qi, Brian D. Davison
IIWAS
2008
13 years 9 months ago
Combining content extraction heuristics: the CombinE system
The main text content of an HTML document on the WWW is typically surrounded by additional contents, such as navigation menus, advertisements, link lists or design elements. Conte...
Thomas Gottron
WWW
2008
ACM
14 years 8 months ago
Learning to rank relational objects and its application to web search
Learning to rank is a new statistical learning technology on creating a ranking model for sorting objects. The technology has been successfully applied to web search, and is becom...
Tao Qin, Tie-Yan Liu, Xu-Dong Zhang, De-Sheng Wang...
WWW
2006
ACM
14 years 8 months ago
A content and structure website mining model
We present a novel model for validating and improving the content and structure organization of a website. This model studies the website as a graph and evaluates its interconnect...
Barbara Poblete, Ricardo A. Baeza-Yates