Search Sciweavers | Sciweavers

KDD
2002
ACM

148 views, Data Mining

Discovering informative content blocks from Web documents

14 years 9 months ago

In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...

Shian-Hua Lin, Jan-Ming Ho

WSDM
2009
ACM

187 views, Data Mining

Speeding up algorithms on compressed web graphs

14 years 4 months ago

A variety of lossless compression schemes have been proposed to reduce the storage requirements of web graphs. One successful approach is virtual node compression [7], in which of...

Chinmay Karande, Kumar Chellapilla, Reid Andersen

KDD
1998
ACM

80 views, Data Mining

Human Performance on Clustering Web Pages: A Preliminary Study

14 years 1 month ago

With the increase in information on the World Wide Web it has become difficult to quickly find desired information without using multiple queries or using a topic-specific search ...

Sofus A. Macskassy, Arunava Banerjee, Brian D. Dav...

CICLING
2009
Springer

335 views, Natural Language Processing

Language Identification on the Web: Extending the Dictionary Method

14 years 1 month ago

Abstract. Automated language identification of written text is a well-established research domain that has received considerable attention in the past. By now, efficient and effecti...

Radim Rehurek, Milan Kolkus

KDD
2002
ACM

293 views, Data Mining

Automatic Categorization of Web Pages and User Clustering with Mixtures of Hidden Markov Models

14 years 9 months ago

We propose mixtures of hidden Markov models for modelling clickstreams of web surfers. Hence, the page categorization is learned from the data without the need for a (possibly cumb...

Alexander Ypma, Tom Heskes
