Search Sciweavers | Sciweavers

» Document frequency and term specificity

WWW
2009
ACM

User-centric content freshness metrics for search engines

16 years 5 months ago

Download www2009.org

In order to return relevant search results, a search engine must keep its local repository synchronized to the Web, but it is usually impossible to attain perfect freshness. Hence...

Ali Dasdan, Xinh Huynh

KDD
2005
ACM

Web mining from competitors' websites

15 years 10 months ago

Download www.cs.uiuc.edu

This paper presents a framework for user-oriented text mining. It is then illustrated with an example of discovering knowledge from competitors’ websites. The knowledge to be di...

Xin Chen, Yi-fang Brook Wu

ECML
2006
Springer

Distributional Features for Text Categorization

15 years 7 months ago

Download cs.nju.edu.cn

Text categorization is the task of assigning predefined categories to natural language text. With the widely used 'bag of words' representation, previous researches...

Xiao-Bing Xue, Zhi-Hua Zhou

ICDM
2006
IEEE

High Quality, Efficient Hierarchical Document Clustering Using Closed Interesting Itemsets

15 years 11 months ago

Download www.cs.columbia.edu

High dimensionality remains a significant challenge for document clustering. Recent approaches used frequent itemsets and closed frequent itemsets to reduce dimensionality, and to...

Hassan H. Malik, John R. Kender

SIGIR
2005
ACM

Title extraction from bodies of HTML documents and its application to web page retrieval

15 years 10 months ago

Download research.microsoft.com

This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...

Yunhua Hu, Guomao Xin, Ruihua Song, Guoping Hu, Sh...
