Search Sciweavers | Sciweavers

708 search results - page 12 / 142

» Identifying Content Blocks from Web Documents

ESORICS
2011
Springer

161 views, Security Privacy

Protecting Private Web Content from Embedded Scripts

14 years 2 months ago

Download www.cs.virginia.edu

Many web pages display personal information provided by users. The goal of this work is to protect that content from untrusted scripts that are embedded in host pages. We present a...

Yuchen Zhou, David Evans

DEXAW
2008
IEEE

123 views, Database

Text Extraction from the Web via Text-to-Tag Ratio

15 years 9 months ago

Download www.uni-weimar.de

We describe a method to extract content text from diverse Web pages by using the HTML document’s Text-to-Tag Ratio rather than specific HTML cues that may not be constant acr...

Tim Weninger, William H. Hsu

CIKM
2008
ACM

138 views, Information Technology

Identifying table boundaries in digital documents via sparse line detection

15 years 5 months ago

Download chemxseer.ist.psu.edu

Most prior work on information extraction has focused on extracting information from text in digital documents. However, often, the most important information being reported in an...

Ying Liu, Prasenjit Mitra, C. Lee Giles

CIKM
2010
ACM

225 views, Information Technology

Automatic metadata extraction from multilingual enterprise content

15 years 1 month ago

Download www.cngl.ie

Enterprises provide professionally authored content about their products/services in different languages for use in web sites and customer care. For customer care, personalization...

Melike Sah, Vincent Wade

KDD
2003
ACM

161 views, Data Mining

Eliminating noisy information in Web pages for data mining

16 years 3 months ago

Download www.cs.uic.edu

A commercial Web page typically contains many information blocks. Apart from the main content blocks, it usually has such blocks as navigation panels, copyright and privacy notice...

Lan Yi, Bing Liu, Xiaoli Li
