IIWAS
2008
Combining content extraction heuristics: the CombinE system
The main text content of an HTML document on the WWW is typically surrounded by additional content, such as navigation menus, advertisements, link lists or design elements. Conte...
Thomas Gottron