Sciweavers

» Extracting Structured Data from Web Pages
WWW
2007
ACM
14 years 9 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
ACMSE
2005
ACM
14 years 2 months ago
The bipartite clique: a topological paradigm for WWWeb user search customization
Web user search customization research has been fueled by the recognition that if the WWW is to attain to its optimal potential as an interactive medium the development of new and...
Brenda F. Miles, Vir V. Phoha
CORR
2010
Springer
13 years 7 months ago
A Probabilistic Approach for Learning Folksonomies from Structured Data
Learning structured representations has emerged as an important problem in many domains, including document and Web data mining, bioinformatics, and image analysis. One approach t...
Anon Plangprasopchok, Kristina Lerman, Lise Getoor
ICN
2001
Springer
14 years 1 months ago
The Influence of Web Page Images on the Performance of Web Servers
In recent years World Wide Web traffic has shown phenomenal growth. The main causes are the continuing increase in the number of people navigating the Internet and the creation of ...
Cristina Hava Muntean, Jennifer McManis, John Murp...
ICDAR
2003
IEEE
14 years 1 months ago
Identifying Story and Preview Images in News Web Pages
The World Wide Web provides an increasingly powerful and popular publication mechanism. Web documents often contain a large number of images serving various different purposes. Th...
Jianying Hu, Amit Bagga