Search Sciweavers | Sciweavers

155 search results - page 24 / 31

» Matching web site structure and content

WIDM
2003
ACM

130 views, Internet Technology

Datarover: a taxonomy based crawler for automated data extraction from data-intensive websites

16 years 22 days ago

Download www.public.asu.edu

The advent of e-commerce has created a trend that brought thousands of catalogs online. Most of these websites are “taxonomy-directed”. A Web site is said to be “taxonomydire...

Hasan Davulcu, S. Koduri, Saravanakumar Nagarajan

SIGIR
2008
ACM

176 views, Information Technology

SpotSigs: robust and efficient near duplicate detection in large web collections

15 years 7 months ago

Download ilpubs.stanford.edu

Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...

Martin Theobald, Jonathan Siddharth, Andreas Paepc...

ERCIMDL
2005
Springer

113 views, Education

mod_oai: An Apache Module for Metadata Harvesting

16 years 1 month ago

Download public.lanl.gov

We describe mod_oai, an Apache 2.0 module that implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The OAI-PMH is the de facto standard for metadata...

Michael L. Nelson, Herbert Van de Sompel, Xiaoming...

WWW
2006
ACM

166 views, Internet Technology

Bootstrapping semantics on the web: meaning elicitation from schemas

16 years 8 months ago

Download www.dit.unitn.it

In most web sites, web-based applications (such as web portals, e-marketplaces, search engines), and in the file systems of personal computers, a wide variety of schemas (such as t...

Paolo Bouquet, Luciano Serafini, Stefano Zanobini,...

AAAI
2006

148 views, Intelligent Agents

Bookmark Hierarchies and Collaborative Recommendation

15 years 9 months ago

Download informatics.indiana.edu

GiveALink.org is a social bookmarking site where users may donate and view their personal bookmark files online securely. The bookmarks are analyzed to build a new generation of i...

Benjamin Markines, Lubomira Stoilova, Filippo Menc...
