Search Sciweavers | Sciweavers

» Postal Address Detection from Web Documents

ICDAR
2009
IEEE

136 views | Document Analysis

Identification of Very Similar Filled-in Forms with a Reject Option

15 years 1 month ago

Download www.cvc.uab.es

In this work, a technique addressed to the reliable identification of very similar filled-in forms, with a reject option, is proposed. The method is based on the automatic detecti...

Joaquim Arlandis, Juan Carlos Pérez-Cortes,...

VLDB
2003
ACM

125 views | Database

THESUS: Organizing Web document collections based on link semantics

16 years 3 months ago

Download www.db-net.aueb.gr

The requirements for effective search and management of the WWW are stronger than ever. Currently Web documents are classified based on their content not taking into acco...

Maria Halkidi, Benjamin Nguyen, Iraklis Varlamis, ...

ICDE
2005
IEEE

126 views | Database

WEBVIGIL: Monitoring Multiple Web Pages and Presentation of XML Pages

15 years 9 months ago

Download web.mst.edu

In the case of large-scale distributed environments such as the Internet, users are interested in monitoring changes to a particular web page (XML or HTML). There are many instanc...

Shravan Chamakura, Alpa Sachde, Sharma Chakravarth...

HT
2003
ACM

131 views | Internet Technology

Enhanced web document summarization using hyperlinks

15 years 8 months ago

Download www.mariapinto.es

This paper addresses the issue of Web document summarization. As textual content of Web documents is often scarce or irrelevant and existing summarization techniques are based on ...

Jean-Yves Delort, Bernadette Bouchon-Meunier, Mari...

COLING
2010

108 views | Computational Linguistics

Large Scale Parallel Document Mining for Machine Translation

14 years 10 months ago

Download static.googleusercontent.com

A distributed system is described that reliably mines parallel text from large corpora. The approach can be regarded as cross-language near-duplicate detection, enabled by an init...

Jakob Uszkoreit, Jay Ponte, Ashok C. Popat, Moshe ...
