Search Sciweavers | Sciweavers

60 search results - page 7 / 12

» Document overlap detection system for distributed digital li...

COLING
2010

108 views, Computational Linguistics

Large Scale Parallel Document Mining for Machine Translation

15 years 2 months ago

Download static.googleusercontent.com

A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...

Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...

ICDM
2009
IEEE

151 views, Data Mining

TagLearner: A P2P Classifier Learning System from Collaboratively Tagged Text Documents

15 years 4 months ago

Download aurora.gmu.edu

The amount of text data on the Internet is growing at a very fast rate. Online text repositories for news agencies, digital libraries and other organizations currently store gigaan...

Haimonti Dutta, Xianshu Zhu, Tushar Mahule, Hillol...

TOIS
2010

128 views

Learning author-topic models from text corpora

15 years 5 months ago

Download www.ics.uci.edu

We propose a new unsupervised learning technique for extracting information about authors and topics from large text collections. We model documents as if they were generated by a...

Michal Rosen-Zvi, Chaitanya Chemudugunta, Thomas L...

ICPR
2008
IEEE

124 views, Computer Vision

A robust front page detection algorithm for large periodical collections

16 years 1 month ago

Download figment.cse.usf.edu

Large-scale digitization projects aimed at periodicals often have as input streams of completely unlabeled document images. In such situations, the results produced by the automat...

Iuliu Vasile Konya, Christoph Seibert, Sebastian G...

SIGIR
2003
ACM

127 views, Information Technology

Evaluating different methods of estimating retrieval quality for resource selection

16 years 9 days ago

Download www.is.informatik.uni-duisburg.de

In a federated digital library system, it is too expensive to query every accessible library. Resource selection is the task to decide to which libraries a query should be routed....

Henrik Nottelmann, Norbert Fuhr
