Search Sciweavers | Sciweavers

103 search results - page 5 / 21

» Models and Algorithms for Duplicate Document Detection

SIGIR
2008
ACM

Local text reuse detection

Download goanna.cs.rmit.edu.au

Text reuse occurs in many different types of documents and for many different reasons. One form of reuse, duplicate or near-duplicate documents, has been a focus of researchers be...

Jangwon Seo, W. Bruce Croft

SIGIR
2000
ACM

An investigation of linguistic features and clustering algorithms for topical document clustering

Download www.cs.columbia.edu

We investigate four hierarchical clustering methods (single-link, complete-link, groupwise-average, and single-pass) and two linguistically motivated text features (noun phrase he...

Vasileios Hatzivassiloglou, Luis Gravano, Ankineed...

ICPR
2008
IEEE

A robust front page detection algorithm for large periodical collections

Download figment.cse.usf.edu

Large-scale digitization projects aimed at periodicals often have as input streams of completely unlabeled document images. In such situations, the results produced by the automat...

Iuliu Vasile Konya, Christoph Seibert, Sebastian G...

MSR
2011
ACM

Modeling the evolution of topics in source code histories

Download sail.cs.queensu.ca

Studying the evolution of topics (collections of co-occurring words) in a software project is an emerging technique to automatically shed light on how the project is changing over...

Stephen W. Thomas, Bram Adams, Ahmed E. Hassan, Do...

ICDAR
2009
IEEE

Clutter Noise Removal in Binary Document Images

Download lampsrv02.umiacs.umd.edu

The paper presents a clutter detection and removal algorithm for complex document images. The distance transform based approach is independent of clutter's position, size, sh...

Mudit Agrawal, David S. Doermann
