Search Sciweavers | Sciweavers

2190 search results - page 103 / 438

» Unweaving a web of documents

ICDAR
2009
IEEE

148 views | Document Analysis

User-Guided Wrapping of PDF Documents Using Graph Matching Techniques

16 years 1 day ago

Download www.cvc.uab.es

There are a number of established products on the market for wrapping—semi-automatic navigation and extraction of data—from web pages. These solutions make use of the inherent...

Tamir Hassan

IEEEICCI
2002
IEEE

97 views | Artificial Intelligence

An Agent-Assisted Document Storage for Software Process Environments

15 years 10 months ago

Download www.semgrid.net

Traditional software process environments store documents using either a centralized or a distributed approach. With the assistance of a web agent, this paper presents a new document st...

Jason Jen-Yen Chen, Chun-Yi Lin

DEXAW
2008
IEEE

123 views | Database

Text Extraction from the Web via Text-to-Tag Ratio

15 years 11 months ago

Download www.uni-weimar.de

We describe a method to extract content text from diverse Web pages by using the HTML document’s Text-to-Tag Ratio rather than specific HTML cues that may not be constant acr...

Tim Weninger, William H. Hsu

ICDE
2008
IEEE

218 views | Database

AxPRE Summaries: Exploring the (Semi-)Structure of XML Web Collections

16 years 6 months ago

Download www.cs.toronto.edu

The nature of semistructured data in web collections is evolving. Increasingly, XML web documents (or documents exchanged via web services) are valid with regard to a schema, yet ...

Mariano P. Consens, Flavio Rizzolo, Alejandro A. V...

CIKM
2009
Springer

226 views | Information Technology

Improving web page classification by label-propagation over click graphs

15 years 12 months ago

Download www.patrickpantel.com

In this paper, we present a semi-supervised learning method for web page classification, leveraging click logs to augment training data by propagating class labels to unlabeled si...

Soo-Min Kim, Patrick Pantel, Lei Duan, Scott Gaffn...
