Search Sciweavers | Sciweavers

498 search results - page 11 / 100

» Robust web content extraction


ICMCS
2007
IEEE

140views Multimedia» more ICMCS 2007»

Audio Signature Extraction Based on Projections of Spectrograms

16 years 1 month ago

Download www.clausbauer.com

Content-based signatures are designed to be a robust bitstream representation of the content so as to enable content identification even though the original content may go through ...

Regunathan Radhakrishnan, Claus Bauer, Corey Cheng...


IIWAS
2008

160views Internet Technology» more IIWAS 2008»

Combining content extraction heuristics: the CombinE system

15 years 8 months ago

Download www.informatik.uni-mainz.de

The main text content of an HTML document on the WWW is typically surrounded by additional contents, such as navigation menus, advertisements, link lists or design elements. Conte...

Thomas Gottron


DEXA
2006
Springer

197views Database» more DEXA 2006»

Cleaning Web Pages for Effective Web Content Mining

15 years 8 months ago

Download sol.cs.uwindsor.ca

Classifying and mining noise-free web pages will improve on accuracy of search results as well as search speed, and may benefit webpage organization applications (e.g., keyword-bas...

Jing Li, Christie I. Ezeife


AUSDM
2006
Springer

160views Data Mining» more AUSDM 2006»

Extraction of Flat and Nested Data Records from Web Pages

15 years 10 months ago

Download crpit.com

This paper studies the problem of identification and extraction of flat and nested data records from a given web page. With the explosive growth of information sources ...

Siddu P. Algur, P. S. Hiremath


SOFSEM
2007
Springer

156views Theoretical Computer Science» more SOFSEM 2007»

Creating Permanent Test Collections of Web Pages for Information Extraction Research

16 years 24 days ago

Download www.dbai.tuwien.ac.at

In the research area of automatic web information extraction, there is a need for permanent and annotated web page collections enabling objective performance evaluation of differen...

Bernhard Pollak, Wolfgang Gatterbauer
