Search Sciweavers | Sciweavers

2677 search results - page 113 / 536

» Extracting Structured Data from Web Pages

ACL 2006 (Computational Linguistics)

A DOM Tree Alignment Model for Mining Parallel Data from the Web

This paper presents a new web mining scheme for parallel data acquisition. Based on the Document Object Model (DOM), a web page is represented as a DOM tree. Then a DOM tree align...

Lei Shi, Cheng Niu, Ming Zhou, Jianfeng Gao

BIS 2010 (Business)

Comparing Intended and Real Usage in Web Portal: Temporal Logic and Data Mining

Nowadays the software systems, including web portals, are developed from a priori assumptions about how the system will be used. However, frequently these assumptions hold only par...

Jérémy Besson, Ieva Mitasiunaite, Au...

LWA 2008 (Software Engineering)

Rule-Based Information Extraction for Structured Data Acquisition using TextMarker

Information extraction is concerned with the location of specific items in (unstructured) textual documents, e.g., being applied for the acquisition of structured data. Then, the ...

Martin Atzmüller, Peter Klügl, Frank Pup...

WSDM 2009, ACM (Data Mining)

Clustering the tagged web

Automatically clustering web pages into semantic groups promises improved search and browsing on the web. In this paper, we demonstrate how user-generated tags from large-scale soc...

Daniel Ramage, Paul Heymann, Christopher D. Mannin...

PVLDB 2008

WebTables: exploring the power of tables on the web

The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...

Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
