Search Sciweavers | Sciweavers

224 search results - page 17 / 45

» Syntactic Folding and its Application to the Information Ext...

WWW
2005
ACM

150 views | Internet Technology

Extracting context to improve accuracy for HTML content extraction

14 years 8 months ago

Download www1.cs.columbia.edu

Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from actual content. Extraction of "use...

Suhit Gupta, Gail E. Kaiser, Salvatore J. Stolfo

ICDM
2008
IEEE

143 views | Data Mining

Exploiting Data Semantics to Discover, Extract, and Model Web Sources

14 years 1 month ago

Download www.isi.edu

We describe DEIMOS, a system that automatically discovers and models new sources of information. The system exploits four core technologies developed by our group that make an en...

José Luis Ambite, Craig A. Knoblock, Kristi...

KDD
2009
ACM

228 views | Data Mining

A generalized Co-HITS algorithm and its application to bipartite graphs

14 years 8 months ago

Download appsrv.cse.cuhk.edu.hk

Recently many data types arising from data mining and Web search applications can be modeled as bipartite graphs. Examples include queries and URLs in query logs, and authors and ...

Hongbo Deng, Michael R. Lyu, Irwin King

WWW
2001
ACM

113 views | Internet Technology

Crawling the Hidden Web

14 years 8 months ago

Download www.dia.uniroma3.it

Current-day crawlers retrieve content only from the publicly indexable Web, i.e., the set of Web pages reachable purely by following hypertext links, ignoring search forms and pag...

Sriram Raghavan, Hector Garcia-Molina

WWW
2011
ACM

208 views | Internet Technology

OXPath: little language, little memory, great value

13 years 2 months ago

Download christian.schallhart.net

Data about everything is readily available on the web—but often only accessible through elaborate user interactions. For automated decision support, extracting that data is esse...

Andrew Jon Sellers, Tim Furche, Georg Gottlob, Gio...
