WSDM
2010
ACM
Boilerplate Detection using Shallow Text Features
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text is typically not related to the main content, ma...
Christian Kohlschütter, Peter Fankhauser, Wol...