Search Sciweavers | Sciweavers

502 search results - page 34 / 101

» Extracting Partial Structures from HTML Documents


ECIR
2010
Springer

173 views, Information Technology

Extracting Multilingual Topics from Unaligned Comparable Corpora

15 years 8 months ago

Download www.umiacs.umd.edu

Topic models have been studied extensively in the context of monolingual corpora. Though there are some attempts to mine topical structure from cross-lingual corpora, they require ...

Jagadeesh Jagarlamudi, Hal Daumé III


VLDB
2011
ACM

251 views, Database

Harvesting relational tables from lists on the web

15 years 1 month ago

Download www.vldb.org

A large number of web pages contain data structured in the form of “lists”. Many such lists can be further split into multi-column tables, which can then be used in more seman...

Hazem Elmeleegy, Jayant Madhavan, Alon Y. Halevy


ERCIMDL
2005
Springer

115 views, Education

A No-Compromises Architecture for Digital Document Preservation

16 years 10 days ago

Download multivalent.sourceforge.net

Abstract. The Multivalent Document Model offers a practical, proven, no-compromises architecture for preserving digital documents of potentially any data format. We have implemented...

Thomas A. Phelps, Paul B. Watry


DOCENG
2007
ACM

134 views, Document Analysis

Extracting reusable document components for variable data printing

15 years 10 months ago

Download eprints.nottingham.ac.uk

Variable Data Printing (VDP) has brought new flexibility and dynamism to the printed page. Each printed instance of a specific class of document can now have different degrees of ...

Steven R. Bagley, David F. Brailsford, James A. Ol...


RIAO
2000

104 views, Information Technology

Combining linguistic and spatial information for document analysis

15 years 8 months ago

Download www.cs.rug.nl

We present a framework to analyze color documents of complex layout. In addition, no assumption is made on the layout. Our framework combines in a content-driven bottom-up approac...

Marco Aiello, Christof Monz, Leon Todoran
