Search Sciweavers | Sciweavers

139 search results - page 6 / 28

» An Approach to Identify Duplicated Web Pages

ICAIL
2007
ACM

Artificial Intelligence

Essential deduplication functions for transactional databases in law firms

13 years 11 months ago

Download www.conradweb.org

As massive document repositories and knowledge management systems continue to expand, in proprietary environments as well as on the Web, the need for duplicate detection becomes i...

Jack G. Conrad, Edward L. Raymond

SIGIR
2004
ACM

Information Technology

Constructing a text corpus for inexact duplicate detection

14 years 1 months ago

Download www.conradweb.org

As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...

Jack G. Conrad, Cindy P. Schriber

CIKM
2009
Springer

Information Technology

Automatic generation of topic pages using query-based aspect models

14 years 2 months ago

Download research.microsoft.com

We investigate the automatic generation of topic pages as an alternative to the current Web search paradigm. We describe a general framework, which combines query log analysis to ...

Niranjan Balasubramanian, Silviu Cucerzan

WWW
2004
ACM

Internet Technology

Matching web site structure and content

14 years 8 months ago

Download www.iw3c2.org

To keep an overview of a complex corporate web site, it is crucial to understand the relationship of contents, structure and the user's behavior. In this paper, we describe ...

Vassil Gedov, Carsten Stolz, Ralph Neuneier, Micha...

WWW
2005
ACM

Internet Technology

Finding the boundaries of information resources on the web

14 years 1 months ago

Download www2005.org

In recent years, many algorithms for the Web have been developed that work with information units distinct from individual web pages. These include segments of web pages or aggreg...

Pavel Dmitriev, Carl Lagoze, Boris Suchkov
