Sciweavers

» Keyword query cleaning
WWW
2005
ACM
14 years 11 months ago
The volume and evolution of web page templates
Web pages contain a combination of unique content and template material, which is present across multiple pages and used primarily for formatting, navigation, and branding. We stu...
David Gibson, Kunal Punera, Andrew Tomkins
WSDM
2010
ACM
14 years 7 months ago
Boilerplate Detection using Shallow Text Features
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text typically is not related to the main content, ma...
Christian Kohlschütter, Peter Fankhauser, Wol...
CIKM
2009
Springer
14 years 4 months ago
Mining tourist information from user-supplied collections
Tourist photographs constitute a large part of the images uploaded to photo sharing platforms. But filtering methods are needed before one can extract useful knowledge from noisy ...
Adrian Popescu, Gregory Grefenstette, Pierre-Alain...
SMA
2009
ACM
14 years 4 months ago
Robust mesh reconstruction from unoriented noisy points
We present a robust method to generate mesh surfaces from unoriented noisy points in this paper. The whole procedure consists of three steps. Firstly, the normal vectors at points...
Hoi Sheung, Charlie C. L. Wang
SIGIR
2006
ACM
14 years 4 months ago
Near-duplicate detection by instance-level constrained clustering
For the task of near-duplicated document detection, both traditional fingerprinting techniques used in database community and bag-of-word comparison approaches used in information...
Hui Yang, James P. Callan